Abstract

Voters from m disjoint constituencies (regions, federal states, etc.) are represented in an 
assembly which contains one delegate from each constituency and applies a weighted voting 
rule. All agents are assumed to have single-peaked preferences over an interval; each delegate's 
preferences match his constituency's median voter; and the collective decision equals the 
assembly's Condorcet winner. We characterize the asymptotic behavior of the probability of 
a given delegate determining the outcome (i.e., being the weighted median of medians) in 
order to address a contentious practical question: which voting weights $w_1, \ldots, w_m$ ought to be
selected if constituency sizes differ and all voters are to have a priori equal influence on collective
decisions? It is shown that if ideal point distributions have identical median $M$ and are suitably
continuous, the probability for a given delegate $i$'s ideal point $\lambda_i$ being the Condorcet winner
becomes asymptotically proportional to $i$'s voting weight $w_i$ times $\lambda_i$'s density at $M$ as $m \to \infty$.
Indirect representation of citizens is approximately egalitarian for weights proportional to the 
square root of constituency sizes if all individual ideal points are i.i.d. In contrast, weights that 
are linear in - or, better, induce a Shapley value linear in - size are egalitarian when preferences 
are sufficiently strongly affiliated within constituencies. 
1 Introduction



The voting weights of delegations to electoral assemblies with a federal or divisional 
structure commonly vary in the size of the represented populations, but do so very 
differently. The US Electoral College, for instance, involves voter blocs that are broadly 
proportional to constituency size: each state has two votes (reflecting its two seats 
in the Senate) in addition to a number which is proportional to population (like 
House seats). California and Wyoming comprise around 11.9% and 0.2% of the US 
population, respectively, and thus end up holding around 10.2% and 0.6% of votes 
on the US President. In contrast, the most and least populous member states of 
the EU - Germany and Malta - currently have about 8.4% and 0.8% of votes in the 
Council of the European Union but comprise 16.3% and 0.1% of the EU population; 
the respective mapping from population size to voting weight is, very roughly, a 
square root function. 1 Delegates in other collective decision-making bodies, such 
as the Governing Council of the European Central Bank, the Senate of Canada or 
German Bundesrat, and many a university senate or council of a multi-branch NGO, 
have voting weights that are yet more concave functions of the number of represented 
constituents, or even flat. 

This paper concerns two-tier voting systems in which individuals vote on delegates 
or representatives in disjoint constituencies (bottom tier) and these representatives 
take collective decisions in a council, electoral college, or other assembly (top tier). 
It investigates a practically relevant, normative question: which simple function - 
possibly linear, possibly strictly concave or constant - should determine the top-tier 
voting weights of delegates from differently sized constituencies such as US states or 
EU member countries? The considered objective is not one of efficiently aggregating 
private information (see, e.g., Bouton and Castanheira 2012) or of maximizing a 
utilitarian measure of welfare as investigated, for instance, by Barbera and Jackson 
(2006). We focus on the egalitarian criterion of 'one person, one vote' and on providing 
all bottom-tier voters, at least a priori and under very stylized ideal conditions, 
with equal influence on the collective decision. This is studied in a model where 
the respective median voter determines a constituency's top-tier policy position 
and the assembly's Condorcet winner defines its collective decision. The relation of 
heterogeneity within each constituency and heterogeneity across constituencies turns out to 
be the critical determinant of the fair voting weight allocation. Linear and square 
root weighting rules emerge in particularly prominent benchmark cases. The former 
is advisable for electorates that are polarized along constituency lines, i.e., exhibit 
significant heterogeneity across constituencies; while the latter is more egalitarian
when heterogeneity within each constituency is dominant.

1 A least squares power-law regression of EU Council voting weights $w_i$ on population sizes $n_i$
results in $w_i = c \cdot n_i^{0.47}$ with $R^2 \approx 0.95$. The current Council voting rules involve two other but essentially
negligible criteria, and will be changed in 2017 into a more proportional system.

The 'one person, one vote' principle is linked to the requirement of anonymity 
in social choice, that is, collective decisions shall depend only on the votes that the 
alternatives receive, not on whose votes these are. This general egalitarian norm is 
sometimes considered the minimum requirement for a decision-making procedure to 
be called 'democratic' (e.g., Dahl 1956, p. 37). It is straightforward to implement - 
at least in theory - in case of a direct, single-tier voting procedure or a two-tier 
one with symmetric constituencies. Complications arise when a two-tier system 
is asymmetric. A non-trivial integer apportionment problem already needs to be 
resolved for those assemblies, like parliaments, in which delegates from the same 
constituency can split their votes and thus reflect heterogeneity among constituents 
(see Balinski and Young 2001). And apportionment gets much more difficult when all 
representatives of a constituency vote as a bloc (as in the US Electoral College, with 
two exceptions) or, equivalently, when the assembly contains a single delegate from 
each constituency who is endowed with a voting weight that varies in population size 
(as in the EU Council). 

A way to adapt the principle to such situations has very vaguely been suggested 
by the US Supreme Court, requiring "that each citizen have an equally effective voice 
in the election" (cf. Reynolds v. Sims, 377 U.S. 533, 1964, p. 565; emphasis added by 
the authors). Here, we operationalize equal efficacy or influence by comparing the 
a priori probabilities of individual voters being decisive or pivotal for the collective 
decision. The corresponding joint event of (i) a given voter determining her delegate's 
vote and of (ii) this representative determining the assembly's collective decision is 
admittedly a rare one. Still, while all being close to zero, the resulting individual pivot 
probabilities can vary widely across constituencies when weights are chosen arbitrarily. 
They should not if an institutional designer wants to fix voting weights (or bloc sizes) 
which are fair at least from behind the constitutional 'veil of ignorance' - that is, when 
preference patterns of the day are ignored for practical or normative reasons. 

The objective of equalizing the a priori influence of each citizen on collective 
decisions was first formally considered by Lionel S. Penrose in 1946, when the 
institutional design of a successor to the League of Nations - today's United Nations 
Organization (UNO) - was being discussed. 2 Penrose (1946) showed that the most 
intuitive solution to the weight allocation problem, i.e., weights proportional to 
constituency sizes, ignores "elementary statistics of majority voting". Namely, if 
there are only two policy alternatives ('yes' and 'no') and all individual decisions are 
statistically independent and equiprobable then the probability of an individual voter 
being pivotal in her constituency with $n_i$ voters, which for odd $n_i$ corresponds to
the probability of $n_i - 1$ voters being divided into 'yes' and 'no'-camps of same size,
is approximately $\sqrt{2/(\pi n_i)}$ (apply Stirling's formula when evaluating the binomial
distribution function). So a voter from a constituency $C_i$ which is four times larger than
constituency $C_j$ a priori faces a smaller probability of tipping the scales locally; but this
probability is still half rather than only a quarter of the reference one. Consequently
top-tier voting weights should be such that the pivot probability of constituency $C_i$
at the top tier is twice - not four times - that of $C_j$, in order to equalize the indirect
influence of all citizens.

2 Informal investigations date back to anti-federalist writings by Luther Martin, a delegate from
Maryland to the Constitutional Convention in Philadelphia in 1787. See Riker (1986).
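
Penrose's approximation is easy to check numerically. The following is a minimal sketch, not part of the original analysis; the function names are illustrative. It compares the exact probability of an even split among the other $n_i - 1$ voters with $\sqrt{2/(\pi n_i)}$ and confirms that quadrupling the constituency size roughly halves the local pivot probability.

```python
import math

def exact_pivot_probability(n_i: int) -> float:
    """Probability that the other n_i - 1 voters split evenly (n_i must be odd)."""
    if n_i % 2 == 0:
        raise ValueError("n_i must be odd")
    others = n_i - 1
    # Binomial(others, 1/2) probability of exactly others/2 'yes' votes.
    return math.comb(others, others // 2) / 2 ** others

def penrose_approximation(n_i: int) -> float:
    """Stirling-based approximation sqrt(2 / (pi * n_i))."""
    return math.sqrt(2.0 / (math.pi * n_i))

for n_i in (101, 401, 1601):
    print(n_i, exact_pivot_probability(n_i), penrose_approximation(n_i))

# A constituency roughly four times larger: the ratio should be close to 1/2.
ratio = exact_pivot_probability(401) / exact_pivot_probability(101)
print("ratio for a four times larger constituency:", ratio)
```

The sizes 101, 401 and 1601 are arbitrary odd numbers chosen for illustration.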

The corresponding practical suggestion is also known as the Penrose square root rule. 
Despite criticism that it treats voting decisions too much like coin tosses, the rule has 
provided a benchmark for numerous applied studies which consider the distribution 
of voting power in the EU, US, or IMF (including Felsenthal and Machover 2001, 
2004; Grofman and Feld 2005; Fidrmuc et al. 2009; Leech and Leech 2009; Miller 
2009, 2012; Kirsch and Langner 2011). And though practitioners may not care about 
Penrose's reasoning itself - for instance, when the EU heads of state and government 
bargained on new, post-2017 voting rules for the Council - they have invoked Penrose's 
suggestion when it fitted their interests. 3 

The special role of square root weight allocation rules has been confirmed, qualified, 
and disputed in a number of studies on two-tier voting systems, both empirically 
(see Gelman et al. 2002; 2004) and theoretically. The respective constitutional 
objective functions and practical conclusions of these investigations vary. Besides 
the equalization of a priori influence (Chamberlain and Rothschild 1981; Felsenthal 
and Machover 1998; Laruelle and Valenciano 2008b; Kaniovski 2008), they consider 
utilitarian welfare maximization (e.g., Beisbart et al. 2005; Barbera and Jackson 2006; 
Beisbart and Bovens 2007; Laruelle and Valenciano 2008b; Koriyama et al. 2012) 
and the avoidance of majoritarian paradoxes like having a Bush majority in the 
2000 Electoral College despite a Gore majority in the population at large (Felsenthal 
and Machover 1999; Kirsch 2007; Feix et al. 2008). Several departures from Penrose's 
independence and equiprobability assumptions have been considered. However, the 
related literature has focused almost entirely on binary political decisions, with no 
scope for bargaining and strategic interaction. 4 

The existing results hence provide useful guidance and arguments in thorny 
debates on the 'right' weight allocation only to the extent that the assemblies in
question indeed decide on dichotomous exogenous proposals. But many decisions
involve several shades of grey. Members of the US Electoral College usually have
binary options, but they face the survivors from a much larger field of initial
contenders, with partly endogenous final political platforms. The EU Council more
commonly decides on the level of subsidies, the scope of regulation, the scale of
financial aid, etc. rather than on having a subsidy, regulation of an industry, or aid
per se. It seems relevant, therefore, to analyze the fair choice of voting weights (and
alternative objectives such as utilitarian welfare) for somewhat richer than binary
$\{0, 1\}$-policy spaces, too.

3 A particularly notorious case involved the then Polish president and prime minister in the
negotiations of the Treaty of Lisbon. See, e.g., The Economist (2007, June 14th).

4 We are aware of the following exceptions only: Laruelle and Valenciano (2008a) suggest a "neutral"
top-tier voting rule when policy alternatives give rise to a Nash bargaining problem. Le Breton et al.
(2012) investigate fair voting weights in case of the division of a transferable surplus, i.e., for a simplex
of policy alternatives. Maaser and Napel (2007; 2012a; 2012b) conduct simulations for a median voter
environment like the one which we will consider here.

This paper considers the equalization of pivot probabilities for a one-dimensional 
convex policy space, i.e., for choices from a real interval. We assume single-peaked 
preferences with random ideal points for all voters, perfect congruence between 
preferences of a constituency's delegate and its median voter, and collective decisions 
which correspond to the Condorcet winner or the core of the game defined by preferences 
and weights of the delegates. The latter can be seen as the equilibrium outcome of 
strategic bargaining (see, e.g., Banks and Duggan 2000). 

In this model, the collective choice equals the weighted median among agents 
whose ideal points themselves are medians from disjoint samples. This is a very 
stylized representation of democratic decision making but yet richer than the binary 
model à la Penrose. The former nests the latter in case that ideal points have a discrete
two-point distribution. In case of less trivial distributions, little can analytically be 
said about the order statistics of medians from differently sized samples; and next to 
nothing has so far been known about the combinatorial function therefrom which 
corresponds to the respective weighted median of non-identically distributed random 
variables. 

We here derive a general analytical result on the ratio of two delegates' pivot 
probabilities in an infinite increasing chain of collective decision bodies (Theorem 1). 
Each delegate $i$ is characterized by his voting weight $w_i$ and single-peaked preferences
with a random ideal point $\lambda_i$ that has the probability density function $f_i$. In line
with the veil of ignorance perspective of constitutional design, this random variable
is a priori assumed to have the same theoretical median for all delegates - say, $M = 0$.
It is shown then that, under suitable regularity conditions, a delegate's probability
of finding his ideal point coincide with the corresponding voting game's Condorcet
winner is asymptotically proportional to the probability density $f_i(M)$ at the theoretical median
times his assigned voting weight $w_i$.

This main analytical result has several practical corollaries for two-tier voting 
systems. In particular, if all individual voters are - behind the constitutional veil of 
ignorance - conceived of as having ideal points that are independent and identically 
distributed (i.i.d.), then the sample median from a constituency $C_i$ with $n_i$ members
has an asymptotically normal distribution whose standard deviation is inversely
proportional to the square root of $n_i$. The probability density of representative $i$'s ideal
point at the theoretical median $M$ is hence proportional to $\sqrt{n_i}$. It follows that voting
weights $w_i$ that are chosen to be proportional to $\sqrt{n_i}$ for all constituencies $C_i$ render
the top-tier pivot probabilities of all representatives proportional to their population 
sizes; this approximately equalizes the expected influence or efficacy of the vote across 
the population. How close one gets to full equalization depends on the considered 
number m of constituencies as well as the population partition at hand. 5 

The optimality of a square root allocation of voting weights is, however, restricted 
to the case of individual ideal points being i.i.d. and the use of a 50%-majority 
threshold. Assuming that voters' ideal points are subject to identical random shocks 
within constituencies - which introduces positive correlation among members of the 
same constituency - implies greater variance of the respective sample medians. The 
latter's distributions become more and more similar across constituencies if the shock
distribution $H$ is identical for all constituencies $C_i$ and its variance $\sigma_H^2$ increases. If
this measure $\sigma_H^2$ of heterogeneity across constituencies is sufficiently great relative to
the heterogeneity within each constituency, which is captured by the variance $\sigma_G^2$ of the
(conditional) ideal point distribution $G$ under a zero shock, then an approximately
linear weight allocation becomes optimal.

That a linear weighting rule is optimal in this case follows as a corollary from 
Theorem 1 when top-tier decisions are taken by simple majority. But, as made
precise by Theorem 2, the finding can be extended to supermajority requirements.
In particular, one can approximate the pivot probabilities of top-tier delegates by the
Shapley value of the respective weighted voting game when $\sigma_H^2/\sigma_G^2$ is sufficiently large.
Even if the number of constituencies m is relatively small, one can hence achieve 
equal representation by finding voting weights such that the resulting Shapley value 
is proportional to population sizes, or as close to being proportional as is feasible. 

The remainder of the paper is organized as follows. In Section 2, we spell out 
our model of two-tier decision making and the institutional design problem. Our 
main result for simple majority rule and $m \to \infty$, as well as its corollary in case that
individual ideal points are i.i.d., are presented in Section 3. We then explore the effect
of adding heterogeneity across constituencies to that within, and study asymptotic 
behavior with respect to the 'across'-kind for fixed m in Section 4. We conclude in 
Section 5 and provide proofs of the two theorems in an appendix. 

5 The approximation can be improved if one bases the weight choice on the induced Shapley value 
or, with comparable effects, the Penrose-Banzhaf power index. See Dubey and Shapley (1979), Felsenthal 
and Machover (1998) or Laruelle and Valenciano (2008b) for good overviews on these and other power 
measures. 
2 Model and Design Problem



We consider partitions $\mathcal{C}^m = \{C_1, \ldots, C_m\}$ of a large number $n$ of voters into $m < n$
disjoint constituencies with $n_i = |C_i| > 0$ members each. The preferences of any voter
$l \in \{1, \ldots, n\} = \bigcup_i C_i$ are assumed to be single-peaked with ideal point $\nu^l$ in a convex
one-dimensional policy space $X \subseteq \mathbb{R}$, i.e., in a finite or infinite real interval. These ideal
points are conceived of as realizations of random variables with a priori identical,
absolutely continuous distributions. A given profile $(\nu^1, \ldots, \nu^n)$ of ideal points is
interpreted as reflecting voter preferences in an abstract left-right spectrum or on 
a specific one-dimensional policy issue (a transfer, an emission standard, a capital 
requirement, etc.). 

A collective decision $x^* \in X$ on the issue at hand is taken by an assembly or council
of representatives $\mathcal{R}^m$ which consists of one representative from each constituency.
Without committing to any particular procedure for internal preference aggregation,
political competition, lobbying or bargaining, it will be assumed that preferences of $C_i$'s
representative coincide with those of its respective median voter, i.e., representative $i$
has the random ideal point

$$\lambda_i = \operatorname{median}\{\nu^l : l \in C_i\}. \tag{1}$$

For simplicity we take all $n_i$ as odd numbers, 6 and leave aside agency problems or
other reasons for why the preferences of a constituency's representative might not be
congruent or at least sensitive to its median voter. 7

In the top-tier assembly $\mathcal{R}^m$, constituency $C_i$ has voting weight $w_i > 0$. Any coalition
$S \subseteq \{1, \ldots, m\}$ of representatives which achieves a combined weight $\sum_{j \in S} w_j$ above

$$q^m = 0.5 \sum_{j=1}^{m} w_j, \tag{2}$$

i.e., which has a simple majority of total weight, is winning and can pass proposals to
implement some policy $x \in X$.

6 For an even number $n_i$, one could let each of the two middlemost ideal points in $C_i$ define the
representative $i$'s preferences with equal probability. Or one works with the usual definition of the
median, i.e., their arithmetic mean, and focuses on the probability of event $\{\partial x^*/\partial \nu^l > 0\}$ rather than
the - no longer equivalent - event $\{x^* = \nu^l\}$ in what follows. Napel and Widgren (2004) discuss in detail
how influence in voting procedures can be quantified by outcome sensitivity measures like $\partial x^*/\partial \nu^l$.

7 See, e.g., Gerber and Lewis (2004) for empirical evidence on how the median voter and partisan
pressures jointly explain legislator preferences, and for a short discussion of the related theoretical
literature. It is important to note that Theorems 1 and 2 will not require (1) to hold - they only assume
$\lambda_i$'s density to have certain properties.

Let $\cdot{:}m$ be the random permutation of $\{1, \ldots, m\}$ that makes $\lambda_{k:m}$ the $k$-th leftmost
ideal point among the representatives for any realization of $\lambda_1, \ldots, \lambda_m$ (that is, $\lambda_{k:m}$ is
the $k$-th order statistic). We will disregard the zero probability events of two or more
constituencies having identical ideal points and define the random variable $P$ by

$$P = \min\Big\{ j \in \{1, \ldots, m\} : \sum_{k=1}^{j} w_{k:m} > q^m \Big\}. \tag{3}$$

Representative $P{:}m$'s ideal point, $\lambda_{P:m}$, cannot be beaten by any alternative $x \in X$ in a
pairwise vote, i.e., it is in the core of the voting game defined by ideal points $\lambda_1, \ldots, \lambda_m$,
weights $w_1, \ldots, w_m$ and quota $q^m$. We assume that the policy $x^*$ agreed by $\mathcal{R}^m$ lies in
the core. So $x^*$ must equal $\lambda_{P:m}$ whenever the core is single-valued; then $\lambda_{P:m}$ actually
beats every other alternative $x \in X$ and is the so-called Condorcet winner in $\mathcal{R}^m$. In
order to avoid inessential case distinctions, we assume that $\mathcal{R}^m$ agrees on $\lambda_{P:m}$ also in
the knife-edge case of the entire interval $[\lambda_{P-1:m}, \lambda_{P:m}]$ being majority-undominated,
i.e., 8

$$x^* = \lambda_{P:m}. \tag{4}$$

Representative $P{:}m$ will, therefore, generally be referred to as the pivotal representative
or the weighted median of $\mathcal{R}^m$. Banks and Duggan (2000) and Cho and Duggan
(2009) provide equilibrium analysis of non-cooperative legislative bargaining which 
supports policy outcomes inside or close to the core. 
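
The definition of the pivotal representative in (2)-(4) translates directly into a short routine. The sketch below is illustrative only; the function name and the example numbers are assumptions, not taken from the paper. It sorts representatives by ideal point and accumulates weights from the left until the quota $q^m$ is strictly exceeded.

```python
from typing import Sequence

def pivotal_representative(ideal_points: Sequence[float],
                           weights: Sequence[float]) -> int:
    """Return the 0-based index of the weighted median (pivotal) representative.

    Assumes positive weights and distinct ideal points, as in the text.
    """
    quota = 0.5 * sum(weights)  # simple-majority quota, cf. equation (2)
    # Order representatives from the leftmost to the rightmost ideal point.
    order = sorted(range(len(ideal_points)), key=lambda i: ideal_points[i])
    cumulative = 0.0
    for i in order:
        cumulative += weights[i]
        if cumulative > quota:  # first time the quota is exceeded, cf. (3)
            return i
    raise ValueError("weights must be positive")

# Hypothetical example with three representatives.
points = [0.9, 0.1, 0.5]
weights = [3, 2, 2]
pivot = pivotal_representative(points, weights)
print("pivotal index:", pivot, "collective decision:", points[pivot])
```

In this example the heaviest representative is not pivotal: the two lighter ones to its left already exceed half of the total weight.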

The event $\{x^* = \nu^l\}$ of voter $l$'s ideal point coinciding with the collective decision
almost surely entails that sufficiently small perturbations or idiosyncratic shifts of $\nu^l$
translate into identical shifts of $x^*$, so that $\partial x^*/\partial \nu^l > 0$. Voter $l$ can then meaningfully
be said to influence, be decisive or pivotal for, or even to determine the collective decision.
This event has probability

$$p^l = \Pr(x^* = \nu^l), \tag{5}$$

which depends on the joint distribution of $(\nu^1, \ldots, \nu^n)$ and the voting weights $w_1, \ldots, w_m$
that have been selected for $\mathcal{R}^m$. Even though $p^l$ will be very small given that the
set of voters $\{1, \ldots, n\}$ is assumed to be large, it would constitute a violation of the
'one person, one vote' principle if $p^l/p^k$ differed substantially from unity for any
$l, k \in \{1, \ldots, n\}$.

8 A sufficient condition for the core to be single-valued is that the vector of weights satisfies
$\sum_{j \in S} w_j \neq q^m$ for each $S \subseteq \{1, \ldots, m\}$. In the non-generic cases where this is violated, tie-breaking assumptions
analogous to fn. 6 can be made. Note that no constituency's median voter will have an incentive
to 'choose' a representative whose preferences differ from her own ones, that is, to misrepresent
preferences, if $x^*$ is determined by (4) (cf. Moulin 1980; Nehring and Puppe 2007).

We will assume throughout our analysis that all voter ideal points are a priori
identically distributed, in line with adopting a 'veil of ignorance'-perspective when one
analyzes the efficacy of individual votes or a priori influence of voters. Moreover,
it is assumed that ideal points are mutually independent across constituencies. We
do, however, allow for a specific form of ideal points being dependent within each
constituency. Namely, we conceive of the ideal point $\nu^l$ of any voter $l \in C_i$ as the sum

$$\nu^l = \mu_i + \varepsilon^l \tag{6}$$

of a constituency-specific random variable $\mu_i$ which has distribution $H$, and a voter-specific
random variable $\varepsilon^l$ with absolutely continuous distribution $G$. The voter-specific
variables $\varepsilon^1, \ldots, \varepsilon^n$ and constituency shocks $\mu_1, \ldots, \mu_m$ are all taken to be
mutually independent. If distribution $H$ of $\mu_i$ is non-degenerate, it reflects a common
attitude component of preferences within the disjoint constituencies. $G$ and $H$ are the
same for all voters $l \in \{1, \ldots, n\}$ and constituencies $C_i \in \mathcal{C}^m$. This ensures that indeed
all ideal points $\nu^1, \ldots, \nu^n$ are identically distributed. $G$'s variance $\sigma_G^2$ can be interpreted
as a measure of heterogeneity within each constituency, reflecting the natural variation
of political preferences. Similarly, $\sigma_H^2$ is a measure of heterogeneity across constituencies:
even though it is assumed that opinions in all constituencies vary between left-right,
religious-secular, etc. in a similar manner, the locations of the respective ranges of
opinion can differ between constituencies. The corresponding correlation coefficient
for two voters $l, k \in C_i$ from the same constituency is $\sigma_H^2/(\sigma_H^2 + \sigma_G^2)$. The case in which
$H$ is degenerate with $\sigma_H^2 = 0$ involves heterogeneity only within constituencies; the
latter differ in size but voter ideal points $\nu^l$ are independent and identically distributed
across the entire population. We regard this as a particularly important benchmark
and will refer to it as the i.i.d. case.
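
The within-constituency correlation implied by the additive shock structure in (6) can be illustrated by simulation. The sketch below is only an illustration under added assumptions: both the constituency shock and the voter-specific term are taken to be Gaussian (the text only requires common distributions $H$ and $G$), and the function name and parameter values are hypothetical.

```python
import math
import random

def simulate_within_correlation(sigma_h: float, sigma_g: float,
                                draws: int = 20000, seed: int = 0) -> float:
    """Estimate the correlation of two voters' ideal points in one constituency.

    Assumption for illustration: H and G are zero-mean normal distributions.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(draws):
        mu = rng.gauss(0.0, sigma_h)               # shared constituency shock
        xs.append(mu + rng.gauss(0.0, sigma_g))    # ideal point of voter l
        ys.append(mu + rng.gauss(0.0, sigma_g))    # ideal point of voter k
    mean_x = sum(xs) / draws
    mean_y = sum(ys) / draws
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / draws
    var_x = sum((x - mean_x) ** 2 for x in xs) / draws
    var_y = sum((y - mean_y) ** 2 for y in ys) / draws
    return cov / math.sqrt(var_x * var_y)

sigma_h, sigma_g = 1.0, 1.0
estimate = simulate_within_correlation(sigma_h, sigma_g)
theory = sigma_h ** 2 / (sigma_h ** 2 + sigma_g ** 2)
print("simulated:", estimate, "theoretical:", theory)
```

Setting `sigma_h` to zero reproduces the i.i.d. case, in which the simulated correlation is close to zero.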

With this notation, we can now state our objective of operationalizing the 'one
person, one vote' principle somewhat more formally. Namely, given a partition $\mathcal{C}^m =
\{C_1, \ldots, C_m\}$ of $n$ voters into constituencies and distributions $G$ and $H$ which describe
heterogeneity of individual preferences within and across constituencies, we would
like to find voting weights $w_1, \ldots, w_m$ such that each voter a priori has an equal chance
of determining the collective decision $x^* \in X$ - that is, such that

$$\frac{p^l}{p^k} = 1 \quad \text{for all } l, k \in \{1, \ldots, n\}. \tag{7}$$

For most combinations of $\mathcal{C}^m$, $G$, and $H$, condition (7) cannot be satisfied by any
weight vector $(w_1, \ldots, w_m)$. This is due to the discrete nature of weighted voting. 9
So the problem would need to be formulated more precisely as that of minimizing
a specific notion of distance between the probability vector $(p^1, \ldots, p^n)$ induced by
$w_1, \ldots, w_m$ and $(1/n, \ldots, 1/n) \in \mathbb{R}^n$.

9 For instance, there are only 117 structurally different weighted voting games with m = 5 
constituencies even if all majority thresholds between 0% and 100% are permitted. This number 
(related to Dedekind's problem in discrete mathematics) grows very fast, but the set of distinct feasible 
influence distributions remains finite. 
The actual concern, however, is not with finding the respective optimal solution
to such a (non-trivial) discrete minimization problem for a particular partition $\mathcal{C}^m$
and specific distributions $G$ and $H$. Rather, our objective is to find a simple function
which maps $n_1, \ldots, n_m$ to weights $w_1, \ldots, w_m$ that induce $p^l/p^k \approx 1$ for all $l$ and
$k$, that is, which approximately satisfy the 'one person, one vote' criterion, for
arbitrary non-pathological partitions $\mathcal{C}^m$. 10 Preferably, qualitative information on
heterogeneity within and across constituencies should suffice in guiding possible
design recommendations.

The stated assumptions imply that, when considering any given realization of $\mu_i$,
ideal points $\nu^l$ and $\nu^k$ are conditionally independent if $l, k \in C_i$ for some $i$. They are in
any case identically distributed. In particular, $p^l = p^k$ holds for $l, k \in C_i$ irrespective of
which $G$, $H$, and voting weights $w_1, \ldots, w_m$ are considered, and it must be the case that
if $l \in C_i$ then

$$\Pr(\nu^l = \lambda_i) = \frac{1}{n_i}. \tag{8}$$

So, an individual voter's probability to be her constituency's median and to determine
$\lambda_i$ is inversely proportional to constituency $C_i$'s population size.

The events $\{\nu^l = \lambda_i\}$ and $\{x^* = \lambda_i\}$ are independent given our statistical assumptions.
(Note that the first event only entails information about the identity of $C_i$'s median, not
its location.) It follows that the probability $p^l$ for an individual voter $l \in C_i$ influencing
the collective decision $x^*$ is $1/n_i$ times the probability of event $\{x^* = \lambda_i\}$ or, equivalently,
of $\{P{:}m = i\}$. Letting

$$\pi_i(\mathcal{R}^m) = \Pr(P{:}m = i) \tag{9}$$

denote the probability of constituency $C_i$'s representative being pivotal in $\mathcal{R}^m$ (that is, of
$\lambda_i$ being the respective Condorcet winner in case of generic weights), our institutional
design objective hence consists of solving the following

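Problem of Equal Representation:

Find a simple mapping from constituency sizes $n_1, \ldots, n_m$ to voting weights
$w_1, \ldots, w_m$ for the representatives in $\mathcal{R}^m$ such that

$$\frac{\pi_i(\mathcal{R}^m)}{\pi_j(\mathcal{R}^m)} \approx \frac{n_i}{n_j} \quad \text{for all } i, j \in \{1, \ldots, m\}. \tag{10}$$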
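
For a concrete partition, the pivot probabilities in (9) and the per-voter influence $\pi_i(\mathcal{R}^m)/n_i$ can be estimated by Monte Carlo simulation. The following sketch is illustrative and rests on assumptions not made in the text: ideal points are i.i.d. uniform on $[0, 1]$, the constituency sizes are hypothetical, and square root weights are used merely as one candidate mapping. With only five constituencies the asymptotic results need not apply closely.

```python
import math
import random
from statistics import median

def weighted_median_index(points, weights):
    """Index of the pivotal representative under a simple-majority quota."""
    quota = 0.5 * sum(weights)
    cumulative = 0.0
    for i in sorted(range(len(points)), key=lambda i: points[i]):
        cumulative += weights[i]
        if cumulative > quota:
            return i
    raise ValueError("weights must be positive")

def estimate_pivot_probabilities(sizes, weights, trials=2000, seed=1):
    """Estimate the pivot probability of each constituency's representative.

    Assumption for illustration: all voters' ideal points are i.i.d. U[0, 1].
    """
    rng = random.Random(seed)
    counts = [0] * len(sizes)
    for _ in range(trials):
        # Each representative adopts the median ideal point of the constituency.
        medians = [median(rng.random() for _ in range(n_i)) for n_i in sizes]
        counts[weighted_median_index(medians, weights)] += 1
    return [c / trials for c in counts]

sizes = [9, 25, 49, 81, 121]                  # hypothetical odd constituency sizes
weights = [math.sqrt(n_i) for n_i in sizes]   # candidate square root weights
pi = estimate_pivot_probabilities(sizes, weights)
per_voter = [p / n_i for p, n_i in zip(pi, sizes)]
for n_i, p, q in zip(sizes, pi, per_voter):
    print(n_i, round(p, 3), q)
```

Equal representation in the sense of (10) would correspond to the last column being approximately constant across constituencies.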

One might conjecture that, if $m$ is large enough and the weight distribution is not overly
skewed, voting weight $w_i$ should translate linearly into representative $i$'s influence
$\pi_i(\mathcal{R}^m)$. 11 But the distribution of the respective ideal points $\lambda_i$ will certainly play a
role, too, and so the solution to this problem will depend on how heterogeneity of
individual preferences within and across constituencies relate.

10 By pathological partitions we, for instance, mean ones where constituency sizes $n_i$ increase
exponentially in $i$. - There is no need to specify exactly which functions are "simple" enough. Power
laws, i.e., choosing $w_i = \beta n_i^{\alpha}$ for some $\alpha, \beta \in \mathbb{R}$, certainly qualify and turn out to constitute a sufficiently
rich class of mappings.

Note that, if the representatives' ideal points $\lambda_1, \ldots, \lambda_m$ were not only mutually
independent but also had identical distributions $F_i = F_j$ for all $i, j \in \{1, \ldots, m\}$, then
all orderings of $\lambda_1, \ldots, \lambda_m$ would a priori be equally likely. In this case, $\pi_i(\mathcal{R}^m)$ would
coincide with $i$'s Shapley value $\varphi_i(v)$, where $v$ is the characteristic function of the $m$-player
cooperative game in which the worth $v(S)$ of a coalition $S \subseteq \{1, \ldots, m\}$ is 1 if
$\sum_{j \in S} w_j > q^m$ and 0 otherwise, and 12

$$\varphi_i(v) = \sum_{S \subseteq \{1, \ldots, m\} \setminus \{i\}} \frac{|S|! \cdot (m - |S| - 1)!}{m!} \, [v(S \cup \{i\}) - v(S)]. \tag{11}$$
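
Formula (11) can be evaluated by brute force for small assemblies. The sketch below is a minimal illustration, not an efficient algorithm: it enumerates all coalitions and is therefore only practical for small $m$. The function name and the example weights are assumptions made for this illustration.

```python
from itertools import combinations
from math import factorial

def shapley_value(weights, quota=None):
    """Shapley value of the weighted voting game [quota; weights] via (11).

    A coalition wins if its total weight strictly exceeds the quota, which
    defaults to half of the total weight (simple majority).
    """
    m = len(weights)
    if quota is None:
        quota = 0.5 * sum(weights)

    def v(coalition):
        return 1 if sum(weights[j] for j in coalition) > quota else 0

    values = []
    for i in range(m):
        others = [j for j in range(m) if j != i]
        total = 0.0
        for size in range(m):  # sizes 0, ..., m - 1 of coalitions without i
            coeff = factorial(size) * factorial(m - size - 1) / factorial(m)
            for S in combinations(others, size):
                total += coeff * (v(S + (i,)) - v(S))
        values.append(total)
    return values

print(shapley_value([3, 2, 2]))  # any two representatives win: equal power
print(shapley_value([2, 1, 1]))  # the heavier representative is pivotal more often
```

The first example shows that unequal weights can induce identical Shapley values, which is one reason why targeting the induced power distribution rather than the weights themselves can matter.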

The way to solve the problem of equal representation would then simply be to 
search for a weighted voting game which induces a Shapley value proportional to 
(ni,...,n m ). Unfortunately, if we assume that distribution G, which generates the 
private component e l in individual ideal points v 1 , is non-degenerate then (1) implies 
that Fj = Fj if and only if = tij. The case in which this holds for all i,j e {1, . . . , m\ 
is precisely the one in which equal representation is trivial, i.e., achieved by giving 
all representatives identical weights because ti\ = ... = n m . 13 In particular, F, second- 
order stochastically dominates distribution Fj (or Fj is a mean-preserving spread of 
Fi) if ft, > tij because the sample median of ft; independent draws from G has smaller 
variance than that of just rij < ft, draws; and the respective draw from H adds identical 

11 Asymptotic proportionality between weights and voting power has first been investigated by 
Penrose (1952) in the context of binary alternatives. Related formal results by Neyman (1982), Lindner 
and Machover (2004), Snyder et al. (2005), Jelnov and Tauman (2012) and Theorem 1 below suppose 
that the relative weight of any given voter becomes negligible as more and more voters are added. The 
case when relative weights of a few large voters fail to vanish as m — > oo - giving rise to oceanic games 
and typically non-proportionality - has been treated by Shapiro and Shapley (1978) and Dubey and 
Shapley (1979). The limit behavior of pivot probabilities for uniform weights (as at the bottom tier) has 
been studied in more complex models than Penrose's by Chamberlain and Rothschild (1981), Myerson 
(2000), German et al. (2002), and Kaniovski (2008). 

12 See Shapley (1953). For so-called simple games - in which $v(S) \in \{0, 1\}$ for all $S \subseteq N$, $v(\emptyset) = 0$,
$v(N) = 1$, and $v(S) = 1 \Rightarrow v(T) = 1$ if $S \subseteq T$ - the Shapley value is also referred to as the Shapley-Shubik
power index, following the first suggestion of using $\phi$ in order to evaluate power in voting bodies by
Shapley and Shubik (1954). We write $v = [q^m; w_1, \ldots, w_m]$ if $v$ is defined by the weighted voting rule
$[q^m; w_1, \ldots, w_m]$.

13 We remark that re-partitioning the population into constituencies of equal size - i.e., appropriate
redistricting - is, of course, a trivial possibility for altogether evading the considered problem. Our
analysis is concerned with those cases where historical, geographical, cultural, and other reasons
have exogenously defined a partition $\mathfrak{C}^m$ which cannot easily be changed. See Coate and Knight
(2007) on socially optimal districting and Gul and Pesendorfer (2010) on strategic issues which arise
for redistricting. We also disregard another relevant strategic feature of two-tier voting: incentives to
allocate limited campaign resources to the constituencies. We refer the reader to Strömberg (2008).
variance to both $\lambda_i$ and $\lambda_j$.

In the i.i.d. benchmark case, in which $\sigma_H^2 = 0$ and the only acknowledged differences
between two voters from distinct constituencies are the numbers of their fellow
constituents, one can be more specific than stochastic dominance. Namely, a standard
result about the sample median of i.i.d. random variables is:

Lemma 1. Let $X_1, \ldots, X_s$ be i.i.d. random variables with median $M$ and a density $f$ that
is continuous at $M$ with $f(M) > 0$. Then random variable $Y = \operatorname{median}\{X_1, \ldots, X_s\}$ is
asymptotically $(M, \sigma^2)$-normally-distributed with

$$\sigma^2 = \frac{1}{s \, [2 f(M)]^2}, \tag{12}$$

i.e., the re-scaled sample median $2 f(M) \sqrt{s} \, [Y - M]$ of $X_1, \ldots, X_s$ converges in distribution to
$N(0, 1)$ as $s \to \infty$.

See, e.g., Arnold et al. (1992, Theorem 8.5.1) for a proof. It follows that in the i.i.d. case,
ideal points $\lambda_1, \ldots, \lambda_m$ in assembly $\mathcal{R}^m$ are (approximately) normally distributed 14 with
identical means but standard deviations that are inversely proportional to the square root
of the respective constituency sizes. So, in the i.i.d. case, rather than all orderings
of $\lambda_1, \ldots, \lambda_m$ being equally likely ($\phi(v)$'s implicit assertion), the representative of a
constituency $C_i$ which is four times larger than constituency $C_j$ has twice the chance
of finding itself in the middle. (Recall that normal density at the mean and median is
inversely proportional to standard deviation.) Then weights that are proportional to
population sizes, or weights such that $\phi(v)$ is, would give representatives of large
constituencies more a priori influence than is due. 15
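The factor of two can be checked numerically in the uniform special case mentioned in footnote 14. The sketch below compares the exact density of a sample median at $M = 1/2$ (a symmetric beta density, valid for odd sample sizes) with the normal approximation implied by Lemma 1. The function names and the sample sizes are illustrative assumptions, not part of the analysis.

```python
from math import exp, lgamma, log, pi, sqrt


def exact_median_density(n):
    """Density at 1/2 of the median of n i.i.d. U[0, 1] draws (n odd).

    The median is Beta(a, a) distributed with a = (n + 1) / 2.
    """
    a = (n + 1) / 2
    log_inverse_beta = lgamma(2 * a) - 2 * lgamma(a)  # log(1 / B(a, a))
    return exp(log_inverse_beta + 2 * (a - 1) * log(0.5))


def normal_median_density(n, g_at_median=1.0):
    """Density at the median under the normal approximation of Lemma 1."""
    sigma = 1.0 / (2.0 * g_at_median * sqrt(n))  # square root of (12)
    return 1.0 / (sqrt(2.0 * pi) * sigma)


# Hypothetical constituency sizes in a ratio of roughly 1 : 4
n_small, n_large = 1001, 4001
ratio = exact_median_density(n_large) / exact_median_density(n_small)
print(f"density ratio at the median: {ratio:.4f}")
print(f"n = 101, exact: {exact_median_density(101):.4f}, "
      f"normal approximation: {normal_median_density(101):.4f}")
```

The ratio comes out close to 2, and the exact and approximate densities at the median already agree to within about one percent for a sample of 101 draws.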

Before we make this reasoning precise in the following section, let us reiterate that the
considered median voter model of equal representation in two-tier decision making
is an admittedly big simplification. Many collective decisions involve more than
just a single dimension in which voter preferences differ. Even if the assumption
of a one-dimensional, say, left-right policy space and single-peaked preferences were
granted, systematic abstention of certain social groups could drive a wedge between

14 This approximation is very good already for rather moderate sample sizes. If, e.g., individual
ideal points $\nu^l$ are standard uniformly distributed, i.e., $\epsilon^l \sim U[0, 1]$ and $\mu_i = 0$, then $\lambda_i$ is beta distributed
with parameters $a = b = (n_i + 1)/2$. The corresponding beta and normal density functions can be
regarded as identical for all practical purposes if $n_i > 100$. Note that Lemma 1 is useful also in case of
a non-degenerate distribution $H$ of $\mu_i$: it establishes that the precise distribution $G$ of individual shocks
does not matter for $\lambda_i$'s distribution $F_i$; only $g(M)$, the sufficiently great $n_i$, and $H$ do.

15 We are unaware of any systematic empirical evidence for or against the claim that representatives 
from larger constituencies tend to be located more centrally in the relevant policy space. This 
theoretical prediction is testable in principle but rests on the two assumptions of aggregate preferences 
being determined by the median individual and individual ideal points being i.i.d. In view of the 
rapid transition towards indistinguishable distributions of representative ideal points when the i.i.d. 
assumption is given up (see Section 4), the claim is bound to be difficult to confirm in practice. 






the median voter's and the median citizen's preferences. 16 We ignore that voting might 
involve private information about some state variable (Feddersen and Pesendorfer 
1996, 1997; Bouton and Castanheira 2012), and typical agency problems connected 
to imperfect monitoring and infrequent delegate selections (for instance, national 
elections of the EU Council's members take place every four to five years). Empirical 
evidence highlights that a representative may take positions that differ significantly 
from his district's median when voter preferences within that district are sufficiently 
heterogeneous (see, e.g., Gerber and Lewis 2004). Still, we take it that the best intuitions 
about fairness are captured by simplifying thought experiments of a 'veil of ignorance'
kind. The analysis of the described stylized world - no friction, particularly well-behaved
preferences which are a priori identical for all - is useful in this way. It
shows the limitations of and justifications for the simple intuition that weights should 
be proportional to the number of represented constituents, in a framework that goes 
beyond the binary world analyzed by Penrose (1946) and others. 



3 Egalitarian Voting Weights for Many Constituencies 

We will in this section consider situations in which the number m of constituencies is 
suitably large. Very few tangible results exist on the distribution of order statistics, 
like the median, from differently distributed random variables (the representative ideal 
points $\lambda_1, \ldots, \lambda_m$). And almost nothing seems to be known about the respective
distribution of a weighted median, which is taken to define the collective decision $x^* \in X$
in our model. It turns out to be possible, nevertheless, to characterize the probability
of some $\lambda_i$ being the weighted median, i.e., the pivot probability $\pi_i(\mathcal{R}^m)$, as $m \to \infty$.

We conceive of $\mathcal{R}^1 \subset \mathcal{R}^2 \subset \mathcal{R}^3 \subset \ldots$ as an infinite chain of assemblies in which more
and more constituencies $i \in \mathbb{N}$ have a representative with a voting weight $w_i \ge 0$ and
a random ideal point $\lambda_i$ with absolutely continuous distribution $F_i$. Some technical
requirements will be imposed on the corresponding density $f_i$, but it does not matter
whether $\lambda_i$ is defined by (1) and corresponds to the median of some set of other random ideal
points like $\{\nu^l\}_{l \in C_i}$; it could, for instance, be the average of some ideal points (such as
those of members of a coalition government or oligarchy) or that of a constituency
dictator. So while the problem of equal representation which we stated in Section 2
is the key motivation for investigating pivot probabilities $\pi_1(\mathcal{R}^m), \ldots, \pi_m(\mathcal{R}^m)$, the
following characterization of their limiting behavior has more general applicability. 17

16 Parts of the population may be without suffrage (minors, aliens, or prisoners). Penrose (1946, 
p. 57) referred to "the square root of the number of people on each nation's voting list" but the political 
discussion of voting weights in the EU Council has almost exclusively referred to population figures. We 
here follow this line and use the terms citizens and voters interchangeably. 

17 In particular, for given $F_1, \ldots, F_m$, $\pi(\mathcal{R}^m)$ amounts to a specific quasivalue or random order value for
simple games. See, e.g., Monderer and Samet (2002).
We will return to the issue of designing egalitarian two-tier voting systems after
considering assemblies $\mathcal{R}^m$ with rather arbitrary ideal point distributions $F_1, \ldots, F_m$
and weighted voting rules $[q^m; w_1, \ldots, w_m]$.

For weight sequences $\{w^m\}_{m \in \mathbb{N}}$ and associated weighted voting games $[q^m; w_1, \ldots, w_m]$
in which nobody's relative voting weight is bounded away from zero, the pivot
probability $\pi_i(\mathcal{R}^m)$ of any given representative $i \in \mathbb{N}$ will converge to zero as $m \to \infty$. 18
Still, $\pi_i(\mathcal{R}^m) / \pi_j(\mathcal{R}^m)$ need not converge. This is illustrated by the sequence $\{w^m\}_{m \in \mathbb{N}}$
with

$$w^m = (1, 2, \ldots, 2) \in \mathbb{R}^m. \tag{13}$$

Representative 1 is either a dummy player with $\pi_1(\mathcal{R}^m) = 0$ or, supposing that the ideal
point distributions $F_1, \ldots, F_m$ are identical, $\pi_i(\mathcal{R}^m) = \frac{1}{m}$ for all $i = 1, \ldots, m$, depending
on whether $m$ is even or odd. So $\pi_1(\mathcal{R}^m) / \pi_2(\mathcal{R}^m)$ alternates between 0 and 1. More
complicated examples of non-convergence can be constructed, e.g., by having $\{w^m\}_{m \in \mathbb{N}}$
oscillate in a suitable fashion.
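The alternation in example (13) can be reproduced by brute force for small $m$. The sketch below assumes identical ideal point distributions, so that all orderings of representatives are equally likely, and it assumes that the quota equals half the total weight with the pivot being the first representative whose cumulative weight strictly exceeds it. The function name `pivot_shares` is an illustrative choice.

```python
from itertools import permutations


def pivot_shares(weights):
    """Share of equally likely orderings in which each player is pivotal.

    Assumes a simple majority quota of half the total weight; the pivot is
    the first player whose cumulative weight strictly exceeds the quota.
    """
    m = len(weights)
    quota = sum(weights) / 2
    counts = [0] * m
    total = 0
    for order in permutations(range(m)):
        running = 0
        for player in order:
            running += weights[player]
            if running > quota:
                counts[player] += 1
                break
        total += 1
    return [c / total for c in counts]


# Weight vectors (1, 2, ..., 2) as in (13) for m = 2, ..., 7
for m in range(2, 8):
    shares = pivot_shares([1] + [2] * (m - 1))
    print(m, round(shares[0], 4), round(shares[1], 4))
```

For even $m$ the first representative is never pivotal, while for odd $m$ every representative is pivotal in a share $1/m$ of the orderings.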

We rule out such possibilities by imposing a weak form of replica structure on the
considered weights $w_1, w_2, w_3, \ldots$ and ideal point distributions $F_1, F_2, F_3, \ldots$ Specifically,
we require that all representatives $i \in \mathbb{N}$ belong to one of an arbitrary but finite number
$r$ of representative types $\theta \in \{1, \ldots, r\}$. All representatives of the same type $\theta$ have an
identical weight $w_\theta$ and distribution $F_\theta$. And, avoiding somewhat contrived situations
like in (13), we restrict attention to chains $\mathcal{R}^1 \subset \mathcal{R}^2 \subset \mathcal{R}^3 \subset \ldots$ in which each type $\theta$
maintains a non-vanishing share of representatives as $m \to \infty$.

The key requirements for the following result are that (i) $F_1, F_2, F_3, \ldots$ have identical
median $M$ and that (ii) each distribution $F_i$ has a density $f_i$ which is locally continuous
and positive at $M$. In order to allow the application of a powerful uniform convergence
result for the Shapley value by Neyman (1982), continuity will be strengthened to the
requirement that each density $f_i$'s variation at its median, $|f_i(x) - f_i(M)|$, can locally
be bounded by a quadratic function $cx^2$. This bound follows readily if $f_i$ is $C^2$ like
the normal density functions singled out by Lemma 1, and could be relaxed to $cx^a$
for any $a > 1$ if one used somewhat less round constants in the proof. Moreover, an
unpublished extension by Abraham Neyman of his 1982 result could be employed
in order to make do with just (ii). Details on this and the proof are presented in
Appendix A.

Theorem 1. Consider an infinite chain $\mathcal{R}^1 \subset \mathcal{R}^2 \subset \mathcal{R}^3 \subset \ldots$ of assemblies which involves
a finite number $r$ of representative types, i.e., there exists a mapping $\tau \colon \mathbb{N} \to \{1, \ldots, r\}$
such that $\tau(i) = \theta$ implies that $\lambda_i$ has density $f_\theta$ and $w_i = w_\theta \ge 0$. 19 Let the share
of each type be bounded away from zero, i.e., there exist $\beta > 0$ and $m^0 \in \mathbb{N}$ such that
$\beta_\theta(m) = |\{k \in \{1, \ldots, m\} \colon \tau(k) = \theta\}| / m \ge \beta > 0$ for all $m \ge m^0$. If for each $\theta \in \{1, \ldots, r\}$ the
distribution $F_\theta$ has median $M$ and its density $f_\theta$ satisfies $f_\theta(M) > 0$ with $|f_\theta(x) - f_\theta(M)| \le cx^2$
on a non-empty interval $[M - \varepsilon_1, M + \varepsilon_1]$ for some $c \ge 0$ then for $w_j > 0$

$$\lim_{m \to \infty} \frac{\pi_i(\mathcal{R}^m)}{\pi_j(\mathcal{R}^m)} = \frac{w_i f_i(M)}{w_j f_j(M)}. \tag{14}$$

18 Concerning the Shapley value $\phi(v)$, which equals $\pi(\mathcal{R}^m)$ if $F_1, \ldots, F_m$ are identical, Neyman (1982,
Lemma 8.2) has established that $\phi_i(v) \le 4 w_i / \sum_{j \ge i} w_j$ for $v = [q; w_1, \ldots, w_m]$ with $w_1 \ge \ldots \ge w_m \ge 0$,
$\sum_{j=1}^{m} w_j = 1$, and $2 w_1 < q < 1 - 2 w_1$.

19 We presume w.l.o.g. that $\tau(i) = i$ for $i \in \{1, \ldots, r\}$.

The key observation behind limit result (14) and its corollaries is that, as $m$ grows
large, the pivotal member of $\mathcal{R}^m$ is most likely found very close to the common median
$M$ of ideal point distributions $F_1, \ldots, F_m$. Pivotality at location $x \in X$ requires that less
than half the total weight of $\mathcal{R}^m$'s members is located in $(-\infty, x)$ and less than half the
total weight is found in $(x, \infty)$. In expectation, this occurs exactly at $x = M$ and, by
Hoeffding's inequality, the probability for the realized weighted median in $\mathcal{R}^m$ to fall
outside an $\varepsilon$-neighborhood of $M$ approaches zero exponentially fast as $m \to \infty$.

One can, therefore, restrict attention to an arbitrarily small interval $[-\varepsilon, \varepsilon] \subseteq
[-\varepsilon_1, \varepsilon_1]$ for sufficiently large $m$ if w.l.o.g. $M = 0$. Since the densities $f_1(x), \ldots, f_m(x)$
satisfy the mentioned kind of continuity, they can suitably be approximated by upper
and lower bounds on this interval. Moreover, when we condition on the respective
events $\{\lambda_j \in [-\varepsilon, \varepsilon]\}$, the corresponding bounds are almost identical for any $j = 1, \ldots, m$
when $m$ is sufficiently large. This makes all orderings of those representatives
with ideal points in $[-\varepsilon, \varepsilon]$ conditionally equiprobable in very good approximation.
Representative $i$'s respective conditional pivot probability, therefore, corresponds
to $i$'s Shapley value in a 'subgame' which involves only the representatives $j$ with
realizations $\lambda_j \in [-\varepsilon, \varepsilon]$. It is possible to apply the uniform convergence result for the
Shapley value proven by Neyman (1982) to each of these subgames. In the final step of
the proof, it then remains to exploit that the probability of the condition $\{\lambda_i \in [-\varepsilon, \varepsilon]\}$
being true becomes proportional to $\lambda_i$'s density at 0 when $\varepsilon \downarrow 0$. 20

Theorem 1 provides a rather general answer to the posed problem of equal
representation in the case that many constituencies are involved. In particular,
comparison of (10) and (14) immediately yields

Corollary 1. If $m$ is sufficiently large then choosing

$$(w_1, \ldots, w_m) \propto \left( \frac{n_1}{f_1(M)}, \ldots, \frac{n_m}{f_m(M)} \right) \tag{15}$$

achieves approximately equal representation (as formalized by condition (10)) if the technical
conditions of Theorem 1 are satisfied by $f_1, \ldots, f_m$.

20 At an intuitive level, one may even directly think of the limit case $\varepsilon = 0$: if one conditions
on representative $i$'s ideal point being located at $x = M = 0$, i.e., $\{\lambda_i = 0\}$, then each representative
$j \ne i$ is equally likely found to $i$'s left or right (because $F_j(M) = \frac{1}{2}$). In this case, $i$'s conditional
pivot probability equals $i$'s Penrose-Banzhaf power index, which, like the Shapley value, becomes
proportional to $(w_1, \ldots, w_m)$ for the replica-like weight sequences that we consider. (See Lindner and
Machover 2004 and Lindner and Owen 2007 on the corresponding limit result.)

If $m \ll \infty$, the approximation of the conditional pivot probabilities for ideal points
in a neighborhood of the common median $M$, which is obtained by (a) considering
the limit case of orderings being conditionally equiprobable and (b) applying
Neyman's limit result for the Shapley value, need not be very good. The latter source
of imprecision can be avoided by computing the Shapley value $\phi(v)$ for the simple
game $v = [q^m; w_1, \ldots, w_m]$ which is defined by representatives' weights and simple
majority rule as described in Section 2. The suggestion in Corollary 1 can hence be
improved somewhat if (15) is replaced by

$$(w_1, \ldots, w_m) \text{ such that } \phi(q^m; w_1, \ldots, w_m) \propto \left( \frac{n_1}{f_1(M)}, \ldots, \frac{n_m}{f_m(M)} \right). \tag{16}$$
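The limit in (14) can be probed with a small Monte Carlo experiment. The sketch below uses two hypothetical representative types with normally distributed ideal points: type 0 has weight 1 and standard deviation 1, type 1 has weight 2 and standard deviation 2. The product of weight and density at the median is then the same for both types, so (14) suggests that, for large $m$, each type should account for roughly half of all pivotal positions when both types are equally numerous. All parameter values, the simple majority quota, and the function name are assumptions made for illustration; for finite $m$ only approximate agreement should be expected.

```python
import random


def simulate_type_shares(m_per_type=50, trials=4000, seed=0):
    """Estimate how often the weighted median belongs to each representative type."""
    rng = random.Random(seed)
    # Hypothetical types: (voting weight, standard deviation of the ideal point)
    types = [(1.0, 1.0), (2.0, 2.0)]
    reps = [t for t in range(len(types)) for _ in range(m_per_type)]
    quota = sum(types[t][0] for t in reps) / 2  # simple majority of total weight
    pivots = [0] * len(types)
    for _ in range(trials):
        # Draw ideal points and order representatives from left to right
        draws = sorted((rng.gauss(0.0, types[t][1]), t) for t in reps)
        running = 0.0
        for _, t in draws:
            running += types[t][0]
            if running > quota:  # first representative to push the weight past the quota
                pivots[t] += 1
                break
    return [p / trials for p in pivots]


print(simulate_type_shares())
```

If only the weights mattered, type 0 would be pivotal about one third of the time; if only the densities mattered, about two thirds of the time. The simulated share lies close to one half, in line with the product rule in (14).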

We conclude this section by specifically considering the benchmark i.i.d. case, in
which the ideal points $\nu^1, \ldots, \nu^n$ correspond just to voter-specific random variables
$\epsilon^1, \ldots, \epsilon^n$ that are independent and identically distributed with a suitable probability
density function $g$, and where $\lambda_i$ corresponds to the median ideal point in $C_i$. In this
case, the ideal points $\lambda_1, \ldots, \lambda_m$ in assembly $\mathcal{R}^m$ are asymptotically normally distributed
by Lemma 1 with respective densities that satisfy the quadratic bound condition of
Theorem 1 and

$$f_i(M) = \frac{1}{\sqrt{2\pi} \, \sigma_i} = \sqrt{\frac{n_i \, [2 g(M)]^2}{2\pi}} > 0. \tag{17}$$

Combining (17) and Corollary 1 we obtain:

Corollary 2 (Square root rule). If the ideal points of all voters are i.i.d., representative $i$'s
ideal point equals the median voter's ideal point in constituency $C_i$ for all $i \in \{1, \ldots, m\}$, and
$m$ is sufficiently large then

$$(w_1, \ldots, w_m) \propto (\sqrt{n_1}, \ldots, \sqrt{n_m}) \tag{18}$$

or, better,

$$(w_1, \ldots, w_m) \text{ such that } \phi(q^m; w_1, \ldots, w_m) \propto (\sqrt{n_1}, \ldots, \sqrt{n_m}) \tag{19}$$

achieves approximately equal representation.
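For concreteness, the direct rule (18) amounts to the following computation. The population figures are hypothetical, and the function name and the `exponent` parameter (included so that square root weights can be compared with linear ones) are illustrative assumptions. Rule (19) would additionally require solving for weights whose Shapley value has the desired proportions, which this sketch does not attempt.

```python
def power_rule_weights(populations, exponent=0.5):
    """Normalized voting weights proportional to n_i ** exponent."""
    raw = [n ** exponent for n in populations]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical constituency sizes
populations = [16_000_000, 4_000_000, 1_000_000]
print(power_rule_weights(populations))                # square root rule as in (18)
print(power_rule_weights(populations, exponent=1.0))  # linear weights for comparison
```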
4 Heterogeneity within vs. across constituencies



Corollary 2 derived a square root rule similar to that of Penrose (1946) for the i.i.d. case, 21
that is, for a degenerate distribution $H$ of the constituency-specific $\mu$-components
of individual ideal points $\nu^l = \mu_i + \epsilon^l$. We now investigate the robustness of this
rule regarding the degree of preference affiliation within each constituency. Non-degenerate
shocks $\mu_i$ imply positive correlation within each constituency and give
rise to polarization of preferences along constituency lines, which is measured by the
ratio $\sigma_H^2 / \sigma_G^2$. 22 It turns out that for sufficiently strong polarization, a linear weight
allocation rule quickly performs better than strictly concave mappings.

This is analytically seen most easily for the case in which all the involved
distributions are normal. First, let all $\epsilon^l$ be distributed normally with zero mean and
variance $\sigma_G^2$. Lemma 1 then implies that the median of $\{\epsilon^l\}_{l \in C_i}$ is approximately normally
distributed with zero mean and variance $\pi \sigma_G^2 / (2 n_i)$. Second, let the constituency-specific
preference component $\mu_i$ be normally distributed with zero mean and variance
$\sigma_H^2$. Constituency $C_i$'s aggregate ideal point $\lambda_i$ - the sum of two independent (approximately)
normally distributed random variables - then also has an approximately
normal distribution. Namely,

$$\lambda_i \sim N\!\left( 0, \; \sigma_H^2 + \frac{\pi \sigma_G^2}{2 n_i} \right). \tag{20}$$

Considering the corresponding densities at $M = 0$ for two representatives $i$ and $j$ yields

$$\frac{f_i(0)}{f_j(0)} = \sqrt{\frac{\sigma_H^2 + \pi \sigma_G^2 / (2 n_j)}{\sigma_H^2 + \pi \sigma_G^2 / (2 n_i)}}. \tag{21}$$

This ratio quickly approaches 1 as $\sigma_H^2 \to \infty$, or if $\sigma_H^2 > 0$ and $n_i, n_j \to \infty$. Corollary 1
then calls for $(w_1, \ldots, w_m) \propto (n_1, \ldots, n_m)$.

We pointed out in Section 2 that heterogeneity within each constituency will
always give rise to different distributions of the sample medians when $n_i \ne n_j$. But
the differences become small and no longer matter for pivotality in $\mathcal{R}^m$ when the
heterogeneity across constituencies is sufficiently great. This is illustrated by Figure 1.

21 Note that the Penrose square root rule does not refer to weights but top-tier pivot probabilities, which
equal the Penrose-Banzhaf power index of the representatives in Penrose's binomial voting model (cf.
fn. 5).

22 The basic features of polarization are according to Esteban and Ray (1994, p. 824): (i) a high
degree of homogeneity within groups, (ii) a high degree of heterogeneity across groups, and (iii) a small
number of significantly sized groups. Ratio $\sigma_H^2 / \sigma_G^2$ serves as a simple measure of polarization of ideal
points $\nu^1, \ldots, \nu^n$ here, where groups are given exogenously. Esteban and Ray characterize polarization
measures for the general case without an exogenous partition of the population.
Figure 1: Densities of $\lambda_i$ and $\lambda_j$ when $n_i = 4 n_j$ and (a) $\mu_i = \mu_j = 0$ or (b) $\mu_i, \mu_j \sim U[-6\sigma, 6\sigma]$

It depicts the density functions of ideal points $\lambda_i$ and $\lambda_j$ when $C_i$ is four times larger
than constituency $C_j$, so that the standard deviation $\sigma_i$ of the median of $\{\epsilon^l\}_{l \in C_i}$ is half
the standard deviation $\sigma_j = \sigma$ of the median of $\{\epsilon^l\}_{l \in C_j}$. Panel (a) shows the densities
when $\sigma_H^2 = 0$ (or when we condition on $\mu_i = \mu_j = 0$); panel (b) depicts the case when
$\mu_i, \mu_j \sim U[-6\sigma, 6\sigma]$. The densities of $\lambda_i$ and $\lambda_j$ in panel (b) are very hard to distinguish
in a neighborhood of the median $M = 0$. This neighborhood's size increases in $\sigma_H^2$,
and it coincides with the relevant policy range in which the Condorcet winner of $\mathcal{R}^m$
is most likely located under simple majority rule.

Recall that the uniform distribution on $[a, b]$ has a variance of $(b - a)^2 / 12$. So panel (b)
shows a situation with $\sigma_H^2 = 12 \sigma^2$. If we assume, as above, that all $\epsilon^l$ are normal with
variance $\sigma_G^2$ then $\sigma_j = \sigma$ corresponds to $\sigma_G^2 = (2 n_j / \pi) \cdot \sigma^2$. Panel (b) hence reflects a
preference dissimilarity or polarization ratio of $\sigma_H^2 / \sigma_G^2 = 6\pi / n_j$, which is tiny when
one thinks of typical real-world population figures $n_j$.
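The arithmetic behind these statements can be retraced in a few lines. The sketch below evaluates the density ratio in (21) and the polarization ratio of panel (b) for a hypothetical constituency size; the function names and the value of $n_j$ are assumptions made for illustration. Plugging the panel (b) variance into (21) treats the constituency shock as normal with the same variance as the uniform shock in the figure, which is an extra simplification.

```python
from math import pi, sqrt


def density_ratio(n_i, n_j, var_h, var_g):
    """f_i(0) / f_j(0) under the normal approximation behind (21)."""
    var_i = var_h + pi * var_g / (2 * n_i)
    var_j = var_h + pi * var_g / (2 * n_j)
    return sqrt(var_j / var_i)


def panel_b_polarization(n_j):
    """sigma_H^2 / sigma_G^2 when mu ~ U[-6 sigma, 6 sigma] and sigma_j = sigma."""
    var_h_over_sigma_sq = (12.0 ** 2) / 12.0  # (b - a)^2 / 12 with b - a = 12 sigma
    var_g_over_sigma_sq = 2.0 * n_j / pi      # sigma_G^2 = (2 n_j / pi) * sigma^2
    return var_h_over_sigma_sq / var_g_over_sigma_sq


n_j = 1_000_000  # hypothetical size of the smaller constituency
n_i = 4 * n_j
var_g = 1.0      # normalization of the individual shock variance

print(density_ratio(n_i, n_j, var_h=0.0, var_g=var_g))  # i.i.d. case
polarization = panel_b_polarization(n_j)
print(polarization)
print(density_ratio(n_i, n_j, var_h=polarization * var_g, var_g=var_g))
```

Under these assumptions the ratio drops from 2 in the i.i.d. case to roughly 1.03 at a polarization ratio of the order of $10^{-5}$.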

This suggests that the phase transition between optimality of a square root rule
and optimality of a linear rule can be very fast. Figure 2 demonstrates this when a
population partition corresponding to the current European Union with 27 member
states (EU27) is considered. The dashed line illustrates the (interpolated) optimal
coefficients $\alpha^*$ as a function of $\sigma_H^2 / \sigma_G^2$ when we search for the best rule in the class

$$(w_1, \ldots, w_m) \propto (n_1^\alpha, \ldots, n_m^\alpha) \tag{22}$$

for $\alpha \in \{0, 0.01, \ldots, 1.99, 2\}$; 23 the solid line analogously depicts $\alpha^*$ when one searches



23 Specifically, we consider $\epsilon^l \sim U[-0.5, 0.5]$ and $\mu_i \sim N(0, \sigma_H^2)$ with $0 \le \sigma_H^2 \le 10^{-6}$ and determine
estimates of the pivot probabilities $\pi_i(\mathcal{R}^{27})$ which are induced by a given value of $\alpha$ via Monte Carlo
simulation. The considered objective is to minimize $\| \cdot \|_1$-distance between individual pivot probabilities
Figure 2: Best coefficient $\alpha$ for direct (dashed line) and Shapley value-based allocation rules
(solid line) with $n_1, \ldots, n_{27}$ defined by EU27 population data

within the class of Shapley value-based rules

$$(w_1, \ldots, w_m) \text{ such that } \phi(q^m; w_1, \ldots, w_m) \propto (n_1^\alpha, \ldots, n_m^\alpha). \tag{23}$$

Optimality of the square root rule can be seen to break down very quickly; already
small degrees of preference dissimilarity across constituencies render a linear rule
based on the Shapley value optimal. 24 This makes it possible to base design
recommendations on rather qualitative assessments of polarization, i.e., it is not
necessary to obtain precise estimates of $\sigma_H^2 / \sigma_G^2$ in applications.

Note that Figure 2 considers real EU population data but counterfactually assumes
Council decisions to be taken by a simple majority. However, Figure 1 suggests that a
majority threshold of $q = 50\%$ may not be a critical condition for optimality of a linear
Shapley rule, provided that $\sigma_H^2 > 0$. We will make this claim precise in the remainder
of the section.

When we presume that assembly $\mathcal{R}^m$ uses the 50%-majority threshold defined in
equation (2), the representative $P{:}m$ defined by (3) can be considered as the pivotal
member of $\mathcal{R}^m$ without much qualification. We can generalize our model and consider
arbitrary relative majority thresholds $q \in [0.5; 1)$ if we are willing to accept a weaker
notion of pivotality. The complication is that the set of policy options that are $q$-majority
undominated is no longer generically unique when $q > 0.5$; supermajority
rules induce cores which typically consist of entire intervals. We can, nevertheless,

and the egalitarian ideal of $(1/n, \ldots, 1/n) \in \mathbb{R}^n$.

24 That $\alpha^*$ fails to converge to 1 when the simpler weight-based rule in (22) is concerned attests to the
combinatorial nature of weighted voting, which cannot be totally ignored even for $m = 27$.
generalize the quota definition in (2) to

$$q^m = q \sum_{j=1}^{m} w_j \tag{24}$$

for $q \in [0.5; 1)$ and consider the representative $P{:}m$ defined by (3) to be pivotal. This
may be justified most easily by supposing that a legislative status quo $x^\circ \approx \infty$ exists
and that formation of a winning coalition proceeds qualitatively in the same fashion
as is sometimes assumed in order to motivate the Shapley value: coalition formation
starts with the most enthusiastic supporters of change on the left, iteratively includes
representatives further to the right, and gives all bargaining power to the first - and
least enthusiastic - member who brings about the required supermajority. 25
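The left-to-right coalition formation just described can be sketched as follows. The function name, the example ideal points, and the convention that the quota in (24) must be strictly exceeded are assumptions made for illustration; definitions (3)-(4) are not reproduced here.

```python
def pivotal_representative(ideal_points, weights, q=0.5):
    """Index of the representative who first brings the coalition past the quota.

    Representatives join from left to right (the status quo is assumed to lie
    far to the right); the quota is q times the total weight, as in (24).
    """
    quota = q * sum(weights)
    order = sorted(range(len(ideal_points)), key=lambda i: ideal_points[i])
    running = 0.0
    for i in order:
        running += weights[i]
        if running > quota:
            return i
    raise ValueError("quota cannot be exceeded; q must be below 1")


# Hypothetical assembly with three equally weighted representatives
points = [0.3, -1.0, 2.0]
weights = [1, 1, 1]
print(pivotal_representative(points, weights, q=0.5))  # the median representative
print(pivotal_representative(points, weights, q=0.9))  # the rightmost representative
```

Raising $q$ moves the pivotal position away from the weighted median toward the right end of the assembly, which is why the limit behavior of Theorem 1 need not carry over to supermajority rules.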

Denote an $m$-member assembly $\mathcal{R}^m$ which uses the relative decision quota $q \in
[0.5; 1)$ and chooses policy $x^* = \lambda_{P:m}$ as defined by (3)-(4) and (24) by $\mathcal{R}^{m,q}$. If $q > 0.5$,
the corresponding pivot probabilities $\pi_i(\mathcal{R}^{m,q})$ and $\pi_j(\mathcal{R}^{m,q})$ of representatives $i$ and $j$
in general fail to exhibit the limit behavior with respect to $m$ which is characterized in
Theorem 1. So Corollaries 1 and 2 do not apply when $q > 0.5$. 26

However, a second asymptotic relationship applies for $q = 0.5$ as well as
arbitrary $q \in (0.5; 1)$, for arbitrary fixed $m$, and without need for any kind of replica
structure. Specifically, we can consider the situation in which given non-degenerate
shock variables $\mu_1, \ldots, \mu_m$, whose common probability density $h$ reflects preference
heterogeneity across constituencies, are scaled by a non-negative factor $t$. Individual
ideal points are then given by

$$\nu^l = t \cdot \mu_i + \epsilon^l \tag{25}$$

for $t \ge 0$. The corresponding ideal point of representative $i$ from constituency $C_i$ is

$$\lambda_i = t \cdot \mu_i + \tilde{\epsilon}_i \tag{26}$$

with

$$\tilde{\epsilon}_i = \operatorname{median}\{\epsilon^l \colon l \in C_i\} \tag{27}$$

where we maintain the assumption that all $\mu_i$ and $\epsilon^l$ are mutually independent and
respectively identically distributed for $i \in \{1, \ldots, m\}$ and $l \in \{1, \ldots, n\}$.

The i.i.d. case amounts to t = 0; and considering a large parameter t corresponds 

25 Justifications for attributing most or all influence in $\mathcal{R}^m$ to representative $P{:}m$ in the supermajority
case date back to Black (1948). Distance-dependent costs of policy reform, a strategic external agenda
setter, or the need of assembly $\mathcal{R}^m$ to bargain with outsiders can motivate a focus on the core's extreme
points. Status quo $x^\circ$ might also vary randomly on $X$ such that it lies to the left or right of the core
equiprobably (with $\pi_i(\mathcal{R}^m)$ then being $i$'s pivot probability conditional on policy change).

26 One can check numerically that when one considers rules $(w_1, \ldots, w_m) \propto (n_1^\alpha, \ldots, n_m^\alpha)$, the optimal
coefficient $\alpha^*(q)$ for the i.i.d. case, where $\alpha^*(0.5) = 0.5$, increases non-linearly in $q$.
to investigating an electorate which is highly polarized along constituency lines. If
we denote the pivot probability of representative $i$ by $\pi_i(\mathcal{R}^{m,q,t})$ and the Shapley value
of the weighted voting game $v = [q^m; w_1, \ldots, w_m]$ with $q^m$ defined by (24) as $\phi(v)$, the
following holds:

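Theorem 2. Consider an assembly $\mathcal{R}^{m,q}$ with an arbitrary number $m$ of constituencies and the
relative decision quota $q \in [0.5; 1)$. For each $i \in \{1, \ldots, m\}$ let $\lambda_i = t \cdot \mu_i + \tilde{\epsilon}_i$, where $\mu_1, \ldots, \mu_m$
and $\tilde{\epsilon}_1, \ldots, \tilde{\epsilon}_m$ are all mutually independent random variables, $\tilde{\epsilon}_1, \ldots, \tilde{\epsilon}_m$ have finite means
and variances, and $\mu_1, \ldots, \mu_m$ have an identical bounded density. Then

$$\lim_{t \to \infty} \pi_i(\mathcal{R}^{m,q,t}) = \phi_i(v). \tag{28}$$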

The proof is provided in Appendix B and formalizes that the respective orderings of
representatives which are induced by $\lambda_1, \ldots, \lambda_m$ and by $t \cdot \mu_1, \ldots, t \cdot \mu_m$ tend to coincide
when $t$ is large. 27 The theorem does not presume that $\tilde{\epsilon}_i$ satisfies (27); the limit (28)
applies also if $\lambda_i$ is determined, e.g., by an oligarchy instead of the median voter of
$C_i$. It is, moreover, worth noting that Theorem 2 does not impose any conditions
like Theorem 1 on densities $g_1, \ldots, g_m$ or voting weights $w_1, \ldots, w_m$ in assembly $\mathcal{R}^{m,q}$.
The Shapley value $\phi(v)$ automatically takes care of any combinatorial particularities
associated with $w_1, \ldots, w_m$; and the convolution with $t \cdot \mu_i$'s bounded density, $\frac{1}{t} h\!\left(\frac{x}{t}\right)$,
is sufficient to 'regularize' any (even non-continuous) distribution $G_i$ of $\tilde{\epsilon}_i$. Applying
Theorem 2 to the specific context of two-tier voting, we can conclude:

Corollary 3 (Linear Shapley rule). If the ideal points of voters are the sum of an individual
component $\epsilon^l$ which is i.i.d. for all $l \in \{1, \ldots, n\}$ and a constituency-specific component $\mu_i$
which is i.i.d. for all $i \in \{1, \ldots, m\}$, representative $i$'s ideal point equals the median voter's ideal
point in constituency $C_i$ for all $i \in \{1, \ldots, m\}$, and $\mu_i$'s variance is sufficiently great relative to
that of $\epsilon^l$ then

$$(w_1, \ldots, w_m) \text{ such that } \phi(q^m; w_1, \ldots, w_m) \propto (n_1, \ldots, n_m) \tag{29}$$

achieves approximately equal representation for any given relative decision quota $q \in [0.5; 1)$.
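The mechanism behind Theorem 2, namely that the ordering of representatives is eventually determined by the scaled constituency shocks alone, can be illustrated by simulation. The sketch below uses hypothetical distributions (uniform constituency shocks, which have a bounded density, and standard normal noise in place of the constituency medians); the function name, all parameter values, and the seed are assumptions made for illustration.

```python
import random


def ordering_match_rate(t, m=5, trials=2000, seed=1):
    """Share of draws in which lambda_i = t * mu_i + eps_i and t * mu_i induce the same ordering."""
    rng = random.Random(seed)
    matches = 0
    for _ in range(trials):
        mu = [rng.uniform(-1.0, 1.0) for _ in range(m)]  # bounded density
        eps = [rng.gauss(0.0, 1.0) for _ in range(m)]    # finite mean and variance
        lam = [t * mu[i] + eps[i] for i in range(m)]
        by_lambda = sorted(range(m), key=lambda i: lam[i])
        by_mu = sorted(range(m), key=lambda i: mu[i])
        matches += by_lambda == by_mu
    return matches / trials


for t in (1.0, 10.0, 100.0, 1000.0):
    print(t, ordering_match_rate(t))
```

When the orderings coincide, the pivotal representative is the one who is pivotal in the ordering of the i.i.d. shocks $\mu_1, \ldots, \mu_m$, for which all orderings are equally likely; this is what makes the Shapley value appear in the limit (28).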

The indirect representation of bottom-tier voters which is achieved by this linear 
Shapley rule can fail to be reasonably egalitarian when m is small, the distribution of 
constituency sizes is extremely skewed or has small variance, or when q is close to 
1. This is because the so-called inverse problem of finding weights which induce the 



27 The density-driven intuition for Theorem 2 which is suggested by Figure 1(b) can also be made
precise: under the additional assumption that the density $h$ of the shock terms $\mu_i$ is Lipschitz continuous,
the density functions of $\lambda_1, \ldots, \lambda_m$ converge uniformly to that of $t \cdot \mu_i$. A proof is available from the
authors.
desired Shapley value often fails to have a good solution in these cases. 28 Still, provided
that the considered heterogeneity across constituencies is sufficiently bigger than the 
heterogeneity within, the indirect representation achieved by (29) is as egalitarian as 
possible. 

Whether Corollary 3 for the case of noticeable preference affiliation within
constituencies or Corollary 2 for the i.i.d. case provides better guidance for designing
a fair two-tier voting system in practice is hard to say. Some preference homogeneity
within and dissimilarity across constituencies seems plausible - whether as the result
of a sorting process ('voting with one's feet') à la Tiebout (1956), due to cultural
uniformity fostered by geographical proximity and local policies (see Alesina and
Spolaore 2003), or for other reasons. If constituencies correspond to entire nations, as
in case of the EU Council or ECB Governing Council, citizens typically share more
historical experience, traditions, language, communication etc.
within constituencies than across. (This plausibly is the key practical reason why
the issue of population size differences cannot trivially be resolved by redistricting
in the first place.) However, the collective decisions that are taken by the top-tier
assembly might be primarily about issues where opinions range over the same liberal-conservative,
markets-government, dove-hawk, etc. spectrum in all constituencies.
Moreover, there might be normative reasons outside the scope of our analysis for
pretending that $\sigma_H^2 = 0$ even if it is not when one designs a presumably long-lasting,
fair constitution. We, therefore, avoid any specific recommendations here for, say, new
voting rules in the EU Council but warn that the i.i.d. presumption is more knife-edge
and, therefore, seems to require special motivation. 29



5 Concluding remarks 

This paper has developed two limit results for the probability of being a decisive voter 
in order to address the issue of egalitarian representation of individuals in a two-tier 
voting system, such as the EU Council or the US Electoral College. Our concern was 
the equalization of the indirect influence which bottom-tier voters can be expected to 

28 This is easily seen, e.g., by considering constituencies of different sizes $n_1, \ldots, n_m$ and a relative
quota $q \approx 1$ which essentially imposes unanimity rule; or by considering just $m = 3$ constituencies,
so that the only feasible Shapley values are - up to isomorphisms - (1/3, 1/3, 1/3), (2/3, 1/6, 1/6) and
(1, 0, 0). A new approach to solving the inverse problem exactly by using integer linear programming has
been proposed by Kurz (2012).

29 A third alternative, inspired by the suggestion of "flexible" democratic mechanisms in other
contexts (see Gersbach 2005, 2009), would be to specify different weighted voting rules for different
policy domains. In some policy areas, such as competition policy, small or unstable between-constituency
differences may call for square root weights; while fair decision making in other policy
domains, such as agriculture or fisheries - with heterogeneous shares of farmable land and some
members landlocked, others islands - could involve linear weights.
have on the collective decision in case of a one-dimensional convex policy space. The
square root rule has played a prominent role in the related political discussion in the
EU as well as the scientific discussion of binary policy environments. The simulations
of Maaser and Napel (2007) suggested that it also applies more generally.

We now provide it with a sound analytical foundation in a median voter environment 
(Corollary 2). However, the somewhat counterintuitive square root rule turns out to 
have limited robustness. It does not extend to supermajority rules; it does badly in 
case of positive correlation of the ideal points at the constituency level. A linear rule 
quickly performs better and becomes optimal for sufficiently strong similarity within 
constituencies. 

This dichotomy is, in some sense, not very surprising. The extensive literature 
on optimal voting weight allocations for binary policy alternatives has, for various 
objective functions, brought about either a square root or a linear rule (with few 
exceptions). Square root rules typically follow from far-reaching homogeneity and 
independence assumptions, while a linear rule is called for in case of dependence 
and significant across-constituency heterogeneity. For instance, Kirsch (2007) finds 
square root weights to minimize the extent of disagreement between the council's 
binary decision and the popular vote with independent 'yes' or 'no' votes, but a linear 
rule if a sufficiently strong "collective bias" of the voters within each constituency 
is introduced. The utilitarian design objective of Barbera and Jackson (2006) calls 
for square root weights in their "fixed-size-block model", while they derive a linear 
rule in a "fixed-number-of-blocks model" which divides each constituency into the 
same number of blocks of identical voters. 30 Beisbart and Bovens (2007) come to a 
very similar conclusion when trying to maximize welfare in another binary model: 
with i.i.d. utility parameters and simple majority rule, square root weights maximize 
total expected utility (and equalize it across citizens). But if each individual's utility is
perfectly correlated with that of a number of other individuals which grows with the size of
the constituency, then the square root rule quickly makes way for a proportional one.

So, using a very different and flexible framework, the corollaries derived from 
two new limit results for interval policy spaces echo a pattern that has emerged also 
in the literature on binary two-tier voting systems. As originally argued by Penrose 
(1946), ex ante independent and identical voters call for a voting weight allocation 
rule based on the square root of population sizes. However, sufficiently strong 
dissimilarity between constituencies renders most people's basic intuition correct - 
plain proportionality does the trick. 

30 The fixed-size-block model conceives of constituencies as consisting of many equally sized blocks 
of individuals whose preferences are perfectly correlated within a block and independent across blocks. 
The existence of such blocks - like those in the fixed-number-of-blocks model - would in our setup
imply that, generically, no individual voter is ever pivotal in his constituency. Still, Theorems 1 and 2
could be used to characterize pivot probabilities of the respective representatives.
Appendices



A Proof of Theorem 1 

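Theorem 1. Consider an infinite chain $\mathcal{R}^1 \subset \mathcal{R}^2 \subset \mathcal{R}^3 \subset \dots$ of assemblies which involves a
finite number $r$ of representative types, i.e., there exists a mapping $\tau\colon \mathbb{N} \to \{1, \dots, r\}$ such that
$\tau(j) = \theta$ implies that $\lambda_j$ has density $f_\theta$ and $w_j = w_\theta \ge 0$. Let the share of each type be bounded
away from zero, i.e., there exist $\beta > 0$ and $m^0 \in \mathbb{N}$ such that
$\beta_\theta(m) = |\{k \in \{1, \dots, m\} : \tau(k) = \theta\}|/m \ge \beta > 0$ for all $m \ge m^0$. If for each $\theta \in \{1, \dots, r\}$ the
distribution $F_\theta$ has median $M$ and its density $f_\theta$ satisfies $f_\theta(M) > 0$ with
$|f_\theta(x) - f_\theta(M)| \le cx^2$ on a non-empty interval $[M - \varepsilon_1, M + \varepsilon_1]$ for some $c \ge 0$ then for $w_j > 0$

$$\lim_{m \to \infty} \frac{\pi_i(\mathcal{R}^m)}{\pi_j(\mathcal{R}^m)} = \frac{w_i f_i(M)}{w_j f_j(M)}.$$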
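
The limit stated in Theorem 1 can be probed numerically. The following is only a minimal Monte Carlo sketch under assumptions that are not part of the theorem: simple majority rule, two hypothetical representative types with normal ideal point densities (common median $M = 0$, different standard deviations), and illustrative weights. The names `weighted_median_index` and `pivot_frequencies` are ours.

```python
import math
import random

def weighted_median_index(points, weights):
    """Index of the pivotal delegate under simple majority: walk through the
    ideal points from left to right and stop once the accumulated weight
    strictly exceeds half of the total weight."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    half = sum(weights) / 2.0
    acc = 0.0
    for i in order:
        acc += weights[i]
        if acc > half:
            return i
    return order[-1]  # only reached if all weights are zero

def pivot_frequencies(types, weight_of, sigma_of, trials, seed=0):
    """Monte Carlo estimate of each type's aggregate pivot probability."""
    rng = random.Random(seed)
    weights = [weight_of[t] for t in types]
    counts = {t: 0 for t in set(types)}
    for _ in range(trials):
        # assumed densities: normal with median M = 0 and type-specific spread
        points = [rng.gauss(0.0, sigma_of[t]) for t in types]
        counts[types[weighted_median_index(points, weights)]] += 1
    return {t: c / trials for t, c in counts.items()}

# Hypothetical assembly: 20 delegates of each of two types.
types = [0] * 20 + [1] * 20
weight_of = {0: 1.0, 1: 2.0}
sigma_of = {0: 1.0, 1: 2.0}
freq = pivot_frequencies(types, weight_of, sigma_of, trials=3000)

# Ratio suggested by Theorem 1 for one type-0 versus one type-1 delegate.
f_at_median = {t: 1.0 / (s * math.sqrt(2.0 * math.pi)) for t, s in sigma_of.items()}
predicted = (weight_of[0] * f_at_median[0]) / (weight_of[1] * f_at_median[1])
observed = freq[0] / freq[1]  # equal type shares, so this is also the per-delegate ratio
print(predicted, observed)
```

In this setup the doubled weight of type 1 is exactly offset by its halved density at the median, so the predicted ratio is 1; the observed ratio is a finite-$m$, finite-sample estimate and should only be expected to be close to it.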



A.1 Overview

Let us first give an overview of the five steps of the proof. In Step 1, we define
a particular neighborhood $I_m$ of the expected location of the weighted median of
$\lambda_1, \dots, \lambda_m$. This essential interval shrinks to $\{M\}$ as $m \to \infty$. It is constructed such
that the probabilities $p_\theta$, $\overleftarrow{p}_\theta$, and $\overrightarrow{p}_\theta$ of a type-$\theta$ representative's ideal point falling
inside $I_m$, inside $I_m$'s left half, or inside $I_m$'s right half, respectively, can suitably be
bounded. Moreover, we decompose the deterministic total number $m_\theta = \beta_\theta(m) \cdot m$
of type-$\theta$ representatives in assembly $\mathcal{R}^m$ into the random numbers $\overleftarrow{k}_\theta$, $k_\theta$, and $\overrightarrow{k}_\theta$
of delegates with ideal points to $I_m$'s left, inside $I_m$, and to $I_m$'s right. Knowing the
respective vector $\mathbf{k} = (\overleftarrow{k}_1, k_1, \overrightarrow{k}_1, \dots, \overleftarrow{k}_r, k_r, \overrightarrow{k}_r)$ will be sufficient to determine whether
the Condorcet winner is located inside $I_m$ or not.

In Step 2, it is established that the weighted median of $\lambda_1, \dots, \lambda_m$ is located inside
the essential interval $I_m$ with a probability that quickly approaches 1 as $m \to \infty$. As
a corollary, the probability $\pi_\theta(\mathcal{R}^m)$ of the Condorcet winner having type $\theta$ converges
to the corresponding conditional probability $\pi_\theta(\mathcal{R}^m \mid \mathcal{K})$ of a type-$\theta$ representative being
pivotal where event $\mathcal{K}$ comprises all realizations of $\mathbf{k}$ such that $\mathcal{R}^m$'s weighted median
lies inside $I_m$.

In Step 3, we show that the random orderings of the $k = \sum_{\theta \in \{1, \dots, r\}} k_\theta$ representatives
with ideal point realizations $\lambda_j \in I_m$ asymptotically become equiprobable as $m \to \infty$.
It follows that, with a vanishing error, the respective conditional pivot probability
$\pi_\theta(\mathcal{R}^m \mid \mathcal{K})$ equals the expected aggregate Shapley value of type-$\theta$ representatives in $I_m$.

In Step 4, the strong convergence result for the Shapley value by Neyman (1982)
is applied to our setting. Neyman's result implies that the aggregate Shapley
value of type-$\theta$ representatives with ideal points in $I_m$ converges to their respective
aggregate voting weight in each considered weighted voting 'subgame' among the
representatives with ideal points $\lambda_j \in I_m$.

Having established that $\pi_\theta(\mathcal{R}^m)$ is asymptotically proportional to the aggregate
voting weight of all type-$\theta$ representatives with ideal points inside $I_m$, aggregate
probabilities are attributed to individual representatives in the final Step 5.



A.2 Proof 

Step 1: Essential interval $I_m$ and vector $\mathbf{k}$

We begin by identifying a neighborhood of $M$ and a sufficiently great number of
representatives such that both the densities $f_\theta$ and the numbers of type-$\theta$
representatives in $\mathcal{R}^m$ can suitably be bounded. This leads to the definition of intervals $I_m$
around $M$ which later steps will focus on. Bounds for the probabilities of a type-$\theta$
representative's ideal point falling inside $I_m$, and more specifically into $I_m$'s left or right
halves, are provided in Lemma 2. The final part of Step 1 introduces the vector $\mathbf{k}$ as
a type-specific summary of how many ideal points are located to the left of $I_m$, inside
$I_m$, and to its right.
First note that

$$0 < \underline{u} = \min_{\theta' \in \{1, \dots, r\}} f_{\theta'}(M) \le f_\theta(M) \le \overline{u} = \max_{\theta' \in \{1, \dots, r\}} f_{\theta'}(M) \tag{30}$$

for every $\theta \in \{1, \dots, r\}$. Using the continuity of $f_\theta$ in a neighborhood of $M$, which is
implied by $|f_\theta(x) - f_\theta(M)| \le cx^2$, we can choose $0 < \varepsilon_2 \le \varepsilon_1$ such that

|/ e (M) < fg(x) < ? -f e (M) (31) 

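for all $x \in [M - \varepsilon_2, M + \varepsilon_2]$ and any specific $\theta \in \{1, \dots, r\}$. Inequality (30) can be used
in order to obtain bounds

$$\tfrac{1}{2}\underline{u} \le f_\theta(x) \le 2\overline{u} \tag{32}$$

for all $x \in [M - \varepsilon_2, M + \varepsilon_2]$ and all $\theta \in \{1, \dots, r\}$ which do not depend on $\theta$. Due to the
existence of $m^0$ we can also choose $0 < \varepsilon_3 \le \varepsilon_2$ such that

$$\beta_\theta(m) \ge \beta > 0 \tag{33}$$

for all $m \ge \varepsilon_3^{-8/3}$ and all $\theta \in \{1, \dots, r\}$. And we can determine $0 < \varepsilon_4 \le \varepsilon_3$ such that

$$24 \le \underline{u}\beta \cdot (m\beta)^{1/40} \le \underline{u}\beta\, m_\theta^{1/40} \tag{34}$$

for all $m \ge \varepsilon_4^{-8/3}$ where $m_\theta = \beta_\theta(m) \cdot m$.
Then define

$$\varepsilon(m) = m^{-3/8} \tag{35}$$

and note that $\varepsilon(m) \le \varepsilon_4$ iff $m \ge m^1 = \varepsilon_4^{-8/3} \ge m^0$. So, whenever we consider a
sufficiently large number of representatives (specifically, $m \ge m^1$), inequalities (31)-(34)
are satisfied. We refer to

$$I_m = [M - \varepsilon(m), M + \varepsilon(m)] \tag{36}$$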
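as the essential interval. The probability of an ideal point of type $\theta$ to fall inside $I_m$ is

$$p_\theta = \int_{M - \varepsilon(m)}^{M + \varepsilon(m)} f_\theta(x)\,dx. \tag{37}$$

For realizations in the left and right halves of $I_m$ we respectively obtain the probabilities

$$\overleftarrow{p}_\theta = \int_{M - \varepsilon(m)}^{M} f_\theta(x)\,dx \quad \text{and} \quad \overrightarrow{p}_\theta = \int_{M}^{M + \varepsilon(m)} f_\theta(x)\,dx, \tag{38}$$

with $\overleftarrow{p}_\theta + \overrightarrow{p}_\theta = p_\theta$.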
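
To get a feeling for the magnitudes involved, the quantities in (35)-(38) can be evaluated for a concrete density. The sketch below assumes, purely for illustration, a normal density with median $M = 0$; the helper names are ours. It shows that $p_\theta$ behaves like $2\varepsilon(m) f_\theta(M)$ once the essential interval is small.

```python
import math

def epsilon(m):
    """Half-width of the essential interval, eps(m) = m^(-3/8) as in (35)."""
    return m ** (-3.0 / 8.0)

def normal_cdf(x, sigma=1.0):
    """Distribution function of a N(0, sigma^2) variable (assumed density)."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def interval_probabilities(m, sigma=1.0):
    """Return (p, p_left, p_right) as in (37)-(38) for median M = 0."""
    e = epsilon(m)
    p_left = normal_cdf(0.0, sigma) - normal_cdf(-e, sigma)
    p_right = normal_cdf(e, sigma) - normal_cdf(0.0, sigma)
    return p_left + p_right, p_left, p_right

f_at_median = 1.0 / math.sqrt(2.0 * math.pi)  # standard normal density at 0
for m in (10**2, 10**4, 10**6):
    p, p_left, p_right = interval_probabilities(m)
    # last column: p relative to the first-order approximation 2 * eps(m) * f(M)
    print(m, epsilon(m), p, p / (2.0 * epsilon(m) * f_at_median))
```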

Lemma 2. For m^m 1 we have 

|/ fl (M)£(m)< p <^/ (M)£-(m), (39) 
|/e(M) £ (m) < p ,p e < ^/ e (M) £ (m), (40) 

_3 _3 

ufim g s < p <4wm 8 , and (41) 

1 3 3 

-Wj6m~ 5 < p , < 2um~\ (42) 

Proof. The inequalities can be concluded from (31)-(33), $m_\theta = \beta_\theta(m) \cdot m$, and $\beta \le 1$. □

Now for any realization $\lambda$ of the ideal points in assembly $\mathcal{R}^m$ let

$$k_\theta = \#\{j : \tau(j) = \theta \text{ and } \lambda_j \in [M - \varepsilon(m), M + \varepsilon(m)]\} \tag{43}$$

denote the number of type-$\theta$ representatives with a policy position in the essential
interval, i.e., no more than $\varepsilon(m)$ away from the expected sample median $M$. Analogously,
let

$$\overleftarrow{k}_\theta = \#\{j : \tau(j) = \theta \text{ and } \lambda_j \in (-\infty, M - \varepsilon(m))\} \tag{44}$$

and

$$\overrightarrow{k}_\theta = \#\{j : \tau(j) = \theta \text{ and } \lambda_j \in (M + \varepsilon(m), \infty)\} \tag{45}$$

denote the random numbers of type-$\theta$ representatives to the left and to the right of $I_m$.

One can conceive of $\lambda$-realizations as the results of a two-part random experiment:
in the first part, it is determined for each $\lambda_j$ whether it is located to the right of $I_m$, to
its left, or inside $I_m$, e.g., by drawing a vector $l = (l_1, \dots, l_m)$ of independent random
variables where $l_j = 1$ ($-1$) indicates a realization of $\lambda_j$ to the right (left) of $I_m$ and $l_j = 0$
indicates $\lambda_j \in I_m$ (with probabilities $\tfrac{1}{2} - \overrightarrow{p}_\theta$, $\tfrac{1}{2} - \overleftarrow{p}_\theta$, and $p_\theta$, respectively). This already
fixes $\overleftarrow{k}_\theta$, $k_\theta$, and $\overrightarrow{k}_\theta$ for each $\theta \in \{1, \dots, r\}$ and is summarized by the vector

$$\mathbf{k} = (\overleftarrow{k}_1, k_1, \overrightarrow{k}_1, \dots, \overleftarrow{k}_r, k_r, \overrightarrow{k}_r). \tag{46}$$

In the second part, the exact ideal point locations are drawn. It will turn out that those
outside $I_m$ can be ignored with vanishing error; and the $k_\theta$ type-$\theta$ ideal points inside
have conditional densities $\hat{f}_\theta$ with

$$\hat{f}_\theta(x) = \frac{f_\theta(x)}{p_\theta} \quad \text{for } x \in I_m. \tag{47}$$

Step 2: Type $\theta$'s aggregate pivot probability $\pi_\theta(\mathcal{R}^m)$ converges to the conditional
probability $\pi_\theta(\mathcal{R}^m \mid \mathcal{K})$ of type $\theta$ being pivotal in $I_m$

We next appeal to Hoeffding's inequality 31 in order to obtain bounds on the probability
that the shares $\overleftarrow{k}_\theta/m_\theta$, $k_\theta/m_\theta$, and $\overrightarrow{k}_\theta/m_\theta$ of representatives with ideal points to the left,
inside, or right of $I_m$ deviate by more than a specified distance from their expectations. These
bounds will imply that one can condition on the pivotal ideal point lying inside $I_m$ in
later steps of the proof with an exponentially decreasing error.

Hoeffding's inequality concerns the average $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$ of $n$ independent bounded
random variables $X_i \in [a_i, b_i]$ and guarantees

$$\Pr\{|\bar{X} - E[\bar{X}]| \ge t\} \le 2 \exp\left(-\frac{2t^2n^2}{\sum_{i=1}^{n} (b_i - a_i)^2}\right). \tag{48}$$

Our specific construction will involve only random variables $X_i \in [0, 1]$, so that

$$\Pr\{|\bar{X} - E[\bar{X}]| \ge t\} \le 2 \exp\left(-2t^2n\right). \tag{49}$$

We will put $n = m_\theta$ for a fixed $\theta \in \{1, \dots, r\}$, so that $n \to \infty$ as $m \to \infty$, and choose
$t = n^{-2/5}$, which implies $t(n) \ll \varepsilon(m)$. For this choice

$$\Pr\{|\bar{X} - E[\bar{X}]| \ge n^{-2/5}\} \le 2 \exp\left(-2n^{1/5}\right), \tag{50}$$

i.e., the probability of "extreme realizations" exponentially goes to zero as $m \to \infty$
(and hence $n = m_\theta \to \infty$).

31 See Hoeffding (1963, Theorem 2).
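
The bound in (50) is easy to tabulate, and for small $n$ it can be compared with simulated frequencies. The following sketch does this for Bernoulli variables, which are one admissible choice of $[0,1]$-valued $X_i$; the parameter values and function names are illustrative assumptions.

```python
import math
import random

def hoeffding_bound(n):
    """Right-hand side of (50): 2 * exp(-2 * n^(1/5)) for t = n^(-2/5)."""
    return 2.0 * math.exp(-2.0 * n ** 0.2)

def empirical_tail(n, p, trials, seed=0):
    """Share of trials in which the mean of n Bernoulli(p) draws deviates
    from its expectation p by at least t = n^(-2/5)."""
    rng = random.Random(seed)
    t = n ** (-0.4)
    hits = 0
    for _ in range(trials):
        mean = sum(rng.random() < p for _ in range(n)) / n
        if abs(mean - p) >= t:
            hits += 1
    return hits / trials

for n in (32, 243, 1024):
    # columns: n, Hoeffding bound, simulated frequency of an extreme realization
    print(n, hoeffding_bound(n), empirical_tail(n, 0.5, trials=500))
```

The bound decays only like $\exp(-2n^{1/5})$, so it is far from tight for moderate $n$; what matters for the proof is that it vanishes faster than any power of $m$.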

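Lemma 3. For each $\theta \in \{1, \dots, r\}$ we have:

(I) $\Pr\left(\overleftarrow{k}_\theta / m_\theta \in \left[\tfrac{1}{2} - \overleftarrow{p}_\theta - m_\theta^{-2/5},\ \tfrac{1}{2} - \overleftarrow{p}_\theta + m_\theta^{-2/5}\right]\right) \ge 1 - 2 \exp\left(-2m_\theta^{1/5}\right)$,

(II) $\Pr\left(k_\theta / m_\theta \in \left[p_\theta - m_\theta^{-2/5},\ p_\theta + m_\theta^{-2/5}\right]\right) \ge 1 - 2 \exp\left(-2m_\theta^{1/5}\right)$,

(III) $\Pr\left(\overrightarrow{k}_\theta / m_\theta \in \left[\tfrac{1}{2} - \overrightarrow{p}_\theta - m_\theta^{-2/5},\ \tfrac{1}{2} - \overrightarrow{p}_\theta + m_\theta^{-2/5}\right]\right) \ge 1 - 2 \exp\left(-2m_\theta^{1/5}\right)$.

Proof. Let $\theta \in \{1, \dots, r\}$ be arbitrary but fixed. For statement (I) we consider the $n = m_\theta$
indices $j_1, \dots, j_{m_\theta} \in \{1, \dots, m\}$ of type $\theta$ and denote by $X_i$ the random variable which is
1 if the realization $\lambda_{j_i}$ lies inside the interval $(-\infty, M - \varepsilon(m))$ and zero otherwise. In the
notation of Hoeffding's inequality we have $\bar{X} = \overleftarrow{k}_\theta / m_\theta$. Since the probability that $\lambda_{j_i}$ lies in
the left half of $I_m$ is given by $\overleftarrow{p}_\theta$ and $\int_{-\infty}^{M} f_\theta(x)\,dx = \int_{M}^{\infty} f_\theta(x)\,dx = \tfrac{1}{2}$, the probability that $\lambda_{j_i}$
lies in the interval $(-\infty, M - \varepsilon(m))$ is given by $\tfrac{1}{2} - \overleftarrow{p}_\theta$. Thus we have $E[\bar{X}] = \tfrac{1}{2} - \overleftarrow{p}_\theta$ and (50)
implies (I). The statements (II) and (III) follow along the same lines (namely, by letting
$X_i$ be the characteristic function of intervals $[M - \varepsilon(m), M + \varepsilon(m)]$ and $(M + \varepsilon(m), \infty)$,
respectively). Note that $m_\theta^{-2/5} \ll \varepsilon(m) = m^{-3/8}$ for large $m$. □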

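We can use the bounds on $p_\theta$ in (41) and that $\beta m \le m_\theta \le m$ for $m \ge m^1 \ge m^0$ in
order to conclude from (II) that for any given $\theta \in \{1, \dots, r\}$

$$\underline{u}\beta^2 \varepsilon(m) \cdot m - m^{3/5} \le k_\theta \le 4\overline{u}\varepsilon(m) \cdot m + m^{3/5} \tag{51}$$

with a probability of at least $1 - 2 \cdot \exp\left(-2m_\theta^{1/5}\right)$. A further implication of observations
(I)-(III) is: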
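Lemma 4. For $m \ge m^1$ the inequalities

$$\overleftarrow{k}_\theta < \tfrac{1}{2} m_\theta \tag{52}$$

$$\overrightarrow{k}_\theta < \tfrac{1}{2} m_\theta \tag{53}$$

$$\overleftarrow{k}_\theta + \tfrac{2}{3} k_\theta > \tfrac{1}{2} m_\theta \tag{54}$$

$$\overrightarrow{k}_\theta + \tfrac{2}{3} k_\theta > \tfrac{1}{2} m_\theta \tag{55}$$

are simultaneously satisfied for all $\theta \in \{1, \dots, r\}$ with a probability of at least
$1 - 6r \exp\left(-2(\beta m)^{1/5}\right)$.

Proof. The events considered in statements (I), (II), and (III) of Lemma 3 are realized
for all $\theta \in \{1, \dots, r\}$ with a joint probability of at least

$$\left(1 - 2 \exp\left(-2(\beta m)^{1/5}\right)\right)^{3r} \ge 1 - 6r \exp\left(-2(\beta m)^{1/5}\right), \tag{56}$$

since $m_\theta \ge \beta m$ for $m \ge m^0$ and $(1 - x)^k \ge 1 - kx$ is valid for all $x \in [0, 1]$ and $k \in \mathbb{N}$. If
$m \ge m^1$, we then have

$$\overleftarrow{k}_\theta \le \left(\tfrac{1}{2} - \overleftarrow{p}_\theta\right) m_\theta + m_\theta^{3/5} \le \frac{m_\theta}{2} - \frac{\underline{u}\beta m_\theta^{5/8}}{2} + m_\theta^{3/5} = \frac{m_\theta}{2} - m_\theta^{3/5} \underbrace{\left(\frac{\underline{u}\beta m_\theta^{1/40}}{2} - 1\right)}_{> 0} < \tfrac{1}{2} m_\theta \tag{57}$$

for any $\theta \in \{1, \dots, r\}$. The first inequality follows directly from (I), the second inequality
uses (42), and the final inequality follows from (34). Analogous inequalities pertain to $\overrightarrow{k}_\theta$.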
Moreover, we can conclude 



<r 2 r ^ I 1 A 3 - 2 Ve 2 

% + -kg > \--peJ-m e -m e ^ + —mg--m e 

<2p 



me 5 3 
m fl 5 + 

2 3 e 



m e 5 



'2p 



Ve\m e 



2 pe 2 



ra 5 s / 1 3 2 \ 



ra 5 s 



ufinig 
24 



-1 



> 2 m e- 



(58) 
(59) 

(60) 
(61) 
(62) 



>o 






The first inequality uses (I) and (II); the second one employs (39) and (40); the third
applies (41); and the final one invokes (34). Analogous inequalities pertain to $\overrightarrow{k}_\theta + \tfrac{2}{3} k_\theta$. □

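Lemma 4 implies that the respective unweighted sample median among representatives
of type $\theta$ is located within $I_m$ for all $\theta \in \{1, \dots, r\}$ with a probability that quickly
approaches 1. The same must a fortiori be true for the pivotal assembly member, i.e.,
the weighted median among all representatives.

We collect in the set $\mathcal{K}$ all $\mathbf{k} = (\overleftarrow{k}_1, k_1, \overrightarrow{k}_1, \dots, \overleftarrow{k}_r, k_r, \overrightarrow{k}_r)$ such that the events considered
by Lemma 3, (I)-(III), are realized for all $\theta \in \{1, \dots, r\}$. The inequalities in Lemma 4
then hold for any $\mathbf{k} \in \mathcal{K}$. We can decompose the probability $\pi_\theta(\mathcal{R}^m)$ of some type-$\theta$
representative being pivotal into conditional probabilities $\pi_\theta(\mathcal{R}^m \mid \mathcal{K})$ and $\pi_\theta(\mathcal{R}^m \mid \neg\mathcal{K})$
which respectively concern only $\lambda$-realizations where $\mathbf{k} \in \mathcal{K}$ and $\mathbf{k} \notin \mathcal{K}$. Then Lemma 4
implies

$$\pi_\theta(\mathcal{R}^m) = \Pr\{\mathcal{K}\} \cdot \pi_\theta(\mathcal{R}^m \mid \mathcal{K}) + \Pr\{\neg\mathcal{K}\} \cdot \pi_\theta(\mathcal{R}^m \mid \neg\mathcal{K}) = \pi_\theta(\mathcal{R}^m \mid \mathcal{K}) + O\left(\exp\left(-2m^{1/5}\right)\right). \tag{63}$$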

Step 3: $\pi_\theta(\mathcal{R}^m \mid \mathcal{K})$ converges to the expectation of type $\theta$'s Shapley value inside $I_m$

Now condition on some $\mathbf{k} \in \mathcal{K}$ such that exactly $\sum_\theta k_\theta = k$ ideal points fall inside
the essential interval, where $k$ is asymptotically proportional to $\varepsilon(m) \cdot m = m^{5/8}$ by (51).
Label them $1, \dots, k$ for ease of notation and let $\varrho \in S_k$ denote an arbitrary element of
the space $S_k$ of permutations which bijectively map $(1, \dots, k)$ to some $(j_1, \dots, j_k)$. The
conditional probability for the event that the $k$ ideal points located in $I_m$ are ordered
exactly as they are in $\varrho$ by the second step of the experiment is

$$p(\varrho \mid \mathbf{k}) = \int_{-\varepsilon(m)}^{\varepsilon(m)} \int_{x_{j_1}}^{\varepsilon(m)} \cdots \int_{x_{j_{k-1}}}^{\varepsilon(m)} \hat{f}_{j_1}(x_{j_1}) \cdots \hat{f}_{j_k}(x_{j_k})\,dx_{j_k} \dots dx_{j_2}\,dx_{j_1}. \tag{64}$$

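Lemma 5. For all $m \ge m^1$, any $\mathbf{k} \in \mathcal{K}$ with $\sum_\theta k_\theta = k$ and permutation $\varrho \in S_k$ we have

$$p(\varrho \mid \mathbf{k}) = \frac{1}{k!} + \frac{1}{k!} \cdot O\left(m^{-1/8}\right). \tag{65}$$

Proof. The premise $|f_\theta(x) - f_\theta(M)| \le cx^2$ for $x \in I_m$ permits us to choose $\delta \in O(\varepsilon(m)^2)$
with $\delta \le \tfrac{1}{2}$ such that

$$(1 - \delta) \cdot f_\theta(M) \le f_\theta(x) \le (1 + \delta) \cdot f_\theta(M) \tag{66}$$

and, equivalently,

$$(1 - \delta) \cdot \hat{f}_\theta(M) \le \hat{f}_\theta(x) \le (1 + \delta) \cdot \hat{f}_\theta(M) \tag{67}$$

for all types $1 \le \theta \le r$ and all $x \in I_m$. Integrating (66) on $I_m$ yields

$$2\varepsilon(m)(1 - \delta) \cdot f_\theta(M) \le p_\theta \le 2\varepsilon(m)(1 + \delta) \cdot f_\theta(M). \tag{68}$$

With these bounds we can conclude from $\hat{f}_\theta(M) = f_\theta(M)/p_\theta$ that

$$\frac{1 - \delta}{2\varepsilon(m)} \le \frac{1}{2\varepsilon(m)(1 + \delta)} \le \hat{f}_\theta(M) \le \frac{1}{2\varepsilon(m)(1 - \delta)} \le \frac{1 + 2\delta}{2\varepsilon(m)} \tag{69}$$

because $1/(1 - \delta) \le 1 + 2\delta$.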

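Using $(1 - \delta)^k \ge 1 - k\delta$ and $(1 + \delta)^k \le 1 + 2k\delta$ for $k\delta \le 1$, 32 and noting that the
hypercube $[0, 1]^k$ can be partitioned into $k!$ polytopes $\{x \in [0, 1]^k : x_{j_1} \le x_{j_2} \le \dots \le x_{j_k}\}$
with equal volume, inequality (67) yields

$$p(\varrho \mid \mathbf{k}) \ge (1 - \delta)^k \int_{-\varepsilon(m)}^{\varepsilon(m)} \int_{x_{j_1}}^{\varepsilon(m)} \cdots \int_{x_{j_{k-1}}}^{\varepsilon(m)} \hat{f}_{j_1}(M) \cdots \hat{f}_{j_k}(M)\,dx_{j_k} \dots dx_{j_2}\,dx_{j_1} \tag{70}$$

$$= (1 - \delta)^k \cdot \hat{f}_{j_1}(M) \cdots \hat{f}_{j_k}(M) \cdot \frac{1}{k!} \int_{-\varepsilon(m)}^{\varepsilon(m)} \int_{-\varepsilon(m)}^{\varepsilon(m)} \cdots \int_{-\varepsilon(m)}^{\varepsilon(m)} 1\,dx_{j_k} \dots dx_{j_2}\,dx_{j_1} \tag{71}$$

$$= \frac{(1 - \delta)^k}{k!} \cdot \hat{f}_{j_1}(M) \cdots \hat{f}_{j_k}(M) \cdot (2\varepsilon(m))^k \tag{72}$$

$$\overset{(69)}{\ge} \frac{(1 - \delta)^{2k}}{k!} \ge \frac{1 - 2k\delta}{k!} \tag{73}$$

and, analogously,

$$p(\varrho \mid \mathbf{k}) \le (1 + \delta)^k \int_{-\varepsilon(m)}^{\varepsilon(m)} \int_{x_{j_1}}^{\varepsilon(m)} \cdots \int_{x_{j_{k-1}}}^{\varepsilon(m)} \hat{f}_{j_1}(M) \cdots \hat{f}_{j_k}(M)\,dx_{j_k} \dots dx_{j_2}\,dx_{j_1} \tag{74}$$

$$= \frac{(1 + \delta)^k}{k!} \cdot \hat{f}_{j_1}(M) \cdots \hat{f}_{j_k}(M) \cdot (2\varepsilon(m))^k \tag{75}$$

$$\overset{(69)}{\le} \frac{(1 + \delta)^k (1 + 2\delta)^k}{k!} \le \frac{(1 + 2\delta)^{2k}}{k!} \le \frac{1 + 8k\delta}{k!}. \tag{76}$$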

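32 The first statement is easily seen by induction on $k$. The second follows from

$$(1 + \delta)^k = \sum_{j=0}^{k} \binom{k}{j} \delta^j \le 1 + \sum_{j=1}^{k} \frac{1}{j!} \underbrace{(k\delta)^j}_{\le k\delta} \le 1 + k\delta \underbrace{\sum_{j=1}^{k} \frac{1}{j!}}_{\le e - 1} \le 1 + 2k\delta.$$

Since $k$ is asymptotically proportional to $m^{5/8}$ and $\varepsilon(m)^2 = m^{-3/4}$ we can choose $\delta \in O(m^{-3/4})$ with
$(k\delta)^j \le k\delta$ for $j \ge 1$ whenever $m$ is large enough.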
This implies

$$\left|\frac{p(\varrho \mid \mathbf{k})}{1/k!} - 1\right| \le 8k\delta. \tag{77}$$

Because $k \in O(m^{5/8})$ and $\delta \in O(m^{-3/4})$, the relative error $|p(\varrho \mid \mathbf{k}) - (k!)^{-1}| / (k!)^{-1}$ tends to zero
at least as fast as $O(m^{-1/8})$. □

So even though the probabilities of the orderings $\varrho \in S_k$ of the $k$ agents inside
$I_m$ differ depending on which specific $\varrho$ is considered and what are the involved
representative types (i.e., which $\mathbf{k}$ is considered), these differences vanish and all
orderings become equiprobable as $m$ gets large.
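
The mechanism behind this equiprobability can be illustrated in the simplest case of $k = 2$ ideal points. The sketch below is not part of the proof and rests on assumptions of ours: the first point has a conditional density proportional to $e^{-x}$ on $[-\varepsilon, \varepsilon]$ (as for a shifted exponential distribution with median 0), the second is uniform on that interval, and $M = 0$. Note that this example density is merely continuous at the median; it is chosen for its visible asymmetry, not because it meets the quadratic bound used in Lemma 5. The probability of the ordering "first point left of second" is computed by a midpoint rule.

```python
import math

def prob_first_below_second(eps, steps=20000):
    """P(X1 < X2 | both in [-eps, eps]) where X1 has conditional density
    proportional to exp(-x) and X2 is uniform on the interval."""
    h = 2.0 * eps / steps
    norm = 0.0  # normalizing constant of X1's conditional density
    acc = 0.0   # integral of exp(-x) * P(X2 > x)
    for s in range(steps):
        x = -eps + (s + 0.5) * h
        fx = math.exp(-x)
        norm += fx * h
        acc += fx * (eps - x) / (2.0 * eps) * h  # P(X2 > x) = (eps - x) / (2 eps)
    return acc / norm

for m in (10, 10**3, 10**5):
    eps = m ** (-3.0 / 8.0)  # half-width of the essential interval
    print(m, eps, prob_first_below_second(eps))
```

As the interval shrinks, the asymmetry of the first density matters less and less, and the probability of either ordering approaches $1/2! = 1/2$.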

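Type $\theta$'s conditional pivot probability can be written as

$$\pi_\theta(\mathcal{R}^m \mid \mathcal{K}) = \sum_{\mathbf{k} \in \mathcal{K}} P(\mathbf{k}) \cdot \Big\{ \sum_{\varrho \in S_k :\, \psi(\mathbf{k}, \varrho) = \theta} p(\varrho \mid \mathbf{k}) \Big\}, \tag{78}$$

where $P(\mathbf{k})$ denotes the probability of $\mathbf{k}$ conditional on event $\{\mathbf{k} \in \mathcal{K}\}$ and function
$\psi\colon \mathcal{K} \times S_k \to \{1, \dots, r\}$ identifies the type $\theta'$ of the pivotal member in $\mathcal{R}^m$ when $\mathbf{k}$
describes how the representative types are divided between $I_m$ and its left or right,
and $\varrho$ captures the ordering inside $I_m$. Lemma 5 approximates the probability of
ordering $\varrho$ conditional on $\mathbf{k}$ as $1/k!$, and one thus obtains

$$\pi_\theta(\mathcal{R}^m \mid \mathcal{K}) = \sum_{\mathbf{k} \in \mathcal{K}} P(\mathbf{k}) \cdot \phi_\theta(\mathbf{k}) + O\left(m^{-1/8}\right) \tag{79}$$

with

$$\phi_\theta(\mathbf{k}) = \sum_{\varrho \in S_k :\, \psi(\mathbf{k}, \varrho) = \theta} \frac{1}{k!}. \tag{80}$$

Because a constant factor $\frac{1}{k!}$ pertains to each ordering $\varrho \in S_k$, $\phi_\theta(\mathbf{k})$ equals the
probability that, as the weights $w_1, w_2, \dots, w_k$ of the $k$ representatives inside $I_m$ are
accumulated in uniform random order, the threshold $q(\mathbf{k}) = q^m - \sum_{\theta \in \{1, \dots, r\}} \overleftarrow{k}_\theta w_\theta$ is first
reached by the weight of a type-$\theta$ representative. The term $\phi_\theta(\mathbf{k})$ is, therefore, simply
the aggregated Shapley value of the type-$\theta$ representatives in the weighted voting
game defined by quota $q(\mathbf{k})$ and weight vector $(w_1, w_2, \dots, w_k)$. Equation (79) states
that $\pi_\theta(\mathcal{R}^m \mid \mathcal{K})$ converges to the expectation of this Shapley value $\phi_\theta(\mathbf{k})$.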
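
The random-order interpretation of $\phi_\theta(\mathbf{k})$ can be made concrete by brute force for a very small game. The following sketch enumerates all orderings, records whose weight first lifts the accumulated total to the quota, and then aggregates by type. The function names and the example game are illustrative assumptions; enumeration is only feasible for a handful of players.

```python
from fractions import Fraction
from itertools import permutations

def shapley_by_enumeration(quota, weights):
    """Shapley value of the weighted voting game [quota; weights]: share of
    orderings in which a player's weight first lifts the running total to
    at least the quota."""
    n = len(weights)
    pivots = [0] * n
    orderings = 0
    for order in permutations(range(n)):
        acc = 0
        for i in order:
            acc += weights[i]
            if acc >= quota:
                pivots[i] += 1
                break
        orderings += 1
    return [Fraction(c, orderings) for c in pivots]

def aggregate_by_type(values, types):
    """Sum individual values over all players of the same type."""
    agg = {}
    for v, t in zip(values, types):
        agg[t] = agg.get(t, 0) + v
    return agg

# Hypothetical subgame: one heavy and two light representatives, quota 3.
values = shapley_by_enumeration(3, [2, 1, 1])
print(values)                                       # individual Shapley values
print(aggregate_by_type(values, ["heavy", "light", "light"]))
```

For this game the individual values are (2/3, 1/6, 1/6), one of the three Shapley vectors that footnote 28 lists as feasible for $m = 3$.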



Step 4: Type $\theta$'s Shapley value $\phi_\theta(\mathbf{k})$ converges to $\theta$'s relative weight in $I_m$

Condition $\mathbf{k} \in \mathcal{K}$ implies $\tfrac{1}{3} \cdot \sum_{\theta \in \{1, \dots, r\}} k_\theta w_\theta \le q(\mathbf{k}) \le \tfrac{2}{3} \cdot \sum_{\theta \in \{1, \dots, r\}} k_\theta w_\theta$ (see Lemma 4).
And our premises guarantee that the relative weight of each individual representative
in $I_m$ shrinks to zero. The "Main Theorem*" in Neyman (1982), therefore, has the
following corollary:
Lemma 6 (Neyman 1982). Given that $\mathbf{k} \in \mathcal{K}$,

$$\phi_\theta(\mathbf{k}) = \frac{k_\theta w_\theta}{\sum_{\theta'=1}^{r} k_{\theta'} w_{\theta'}} \left(1 + \mu(m)\right) \quad \text{with} \quad \lim_{m \to \infty} |\mu(m)| = 0. \tag{81}$$

Proof. Neyman's theorem considers an infinite sequence of weighted voting games
$[q^n; w^n]$ with $n$ voters whose individual relative weights $w_i^n$ approach 0, and
in which the relative quota $q^n$ is bounded away from 0 and 100% (or at least
$\lim_{n \to \infty} q^n / (\max_i w_i^n) = \infty$). Neyman establishes that 33

$$\lim_{n \to \infty} \Big| \phi_{T_n}(q^n; w^n) - \sum_{i \in T_n} w_i^n \Big| = 0 \tag{82}$$

holds for any sequence of voter subsets $T_n \subseteq \{1, \dots, n\}$, where $\phi_{T_n}(q^n; w^n)$ denotes their
aggregate Shapley value. (We here consider $q^n = q(\mathbf{k})/w_\Sigma$, $w^n = (w_1, w_2, \dots, w_k)/w_\Sigma$
and $T_n = \{i \in N : \tau(i) = \theta\}$ for $N = \{1, \dots, k\}$ and $w_\Sigma = \sum_{i \in N} w_i$. 34)

It is trivial that (81) holds if $w_\theta = 0 = \phi_\theta(\mathbf{k})$. So we can assume $w_\theta > 0$, and because
there is at least the proportion $\beta > 0$ of representatives from each type in $I_m$ for large
$m$, the aggregate relative weight of $\theta$-type representatives in $I_m$ is bounded away from
0, i.e., 35

$$\liminf_{m \to \infty} \frac{k_\theta w_\theta}{\sum_{\theta'=1}^{r} k_{\theta'} w_{\theta'}} > 0. \tag{83}$$

Therefore, not only the absolute error $\tilde{\mu}(m)$ made in approximating $\phi_\theta(\mathbf{k}) = \phi_{T_n}(q^n; w^n)$
by $\frac{k_\theta w_\theta}{\sum_{\theta'} k_{\theta'} w_{\theta'}}$ but also the relative error $\mu(m) = \tilde{\mu}(m) \big/ \frac{k_\theta w_\theta}{\sum_{\theta'} k_{\theta'} w_{\theta'}}$ must vanish as $m \to \infty$. □
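
The convergence asserted by Lemma 6 can be observed numerically. The sketch below computes exact Shapley values of weighted voting games with integer weights by a standard dynamic program over coalition sizes and weights (a textbook technique, not a procedure taken from Neyman 1982), and compares the aggregate value of the heavier type with its aggregate relative weight. The two-type game, the simple majority quota and the function name are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def shapley_dp(quota, weights):
    """Exact Shapley values of [quota; weights] for positive integer weights."""
    n = len(weights)
    total = sum(weights)
    values = []
    for i in range(n):
        # count[s][w]: number of coalitions of the other players with
        # exactly s members and total weight w
        count = [[0] * (total + 1) for _ in range(n)]
        count[0][0] = 1
        for j, wj in enumerate(weights):
            if j == i:
                continue
            for s in range(n - 2, -1, -1):
                for w in range(total - wj, -1, -1):
                    if count[s][w]:
                        count[s + 1][w + wj] += count[s][w]
        phi = Fraction(0)
        low = max(quota - weights[i], 0)
        for s in range(n):
            # player i is pivotal if its predecessors weigh in [quota - w_i, quota)
            swings = sum(count[s][w] for w in range(low, min(quota, total + 1)))
            phi += Fraction(factorial(s) * factorial(n - 1 - s), factorial(n)) * swings
        values.append(phi)
    return values

for k in (6, 12, 24):
    weights = [1] * (k // 2) + [2] * (k // 2)  # two hypothetical types
    quota = sum(weights) // 2 + 1              # simple majority (assumed)
    values = shapley_dp(quota, weights)
    heavy_share = float(sum(values[k // 2:]))
    print(k, heavy_share, 2.0 / 3.0)           # aggregate value vs. relative weight
```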



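33 We somewhat specialize his finding and adapt the notation.

34 Our notation leaves some inessential technicalities implicit: $\mathcal{K}$ really refers to a family of such sets,
parameterized by $m$; we implicitly consider a sequence of $\mathbf{k}$-vectors such that $n \to \infty$ as $m \to \infty$.

35 The limit itself need not exist because our premises do not rule out that, e.g., $m_\theta$ is periodic in $m$.

Step 5: Attributing aggregate pivot probabilities to individual representatives

It then remains to disaggregate the pivot probabilities $\pi_\theta(\mathcal{R}^m)$ and $\pi_{\theta'}(\mathcal{R}^m)$ of types $\theta$
and $\theta'$ to individual representatives $i$ and $j$. The aggregate relative weight of type-$\theta$
representatives in the essential interval satisfies

$$\frac{k_\theta w_\theta}{\sum_{\theta'} k_{\theta'} w_{\theta'}} = \frac{\beta_\theta(m)\, m\, p_\theta w_\theta \left(1 + O(m^{-1/40})\right)}{\sum_{\theta'} \beta_{\theta'}(m)\, m\, p_{\theta'} w_{\theta'} \left(1 + O(m^{-1/40})\right)} = \frac{\beta_\theta(m) p_\theta w_\theta}{\sum_{\theta'} \beta_{\theta'}(m) p_{\theta'} w_{\theta'}} \left(1 + O(m^{-1/40})\right) \tag{84}$$

for any $\mathbf{k} \in \mathcal{K}$ (see (II) in Lemma 3). 36 Combining this with equations (63), (79) and
(81) yields

$$\lim_{m \to \infty} \frac{\pi_\theta(\mathcal{R}^m)}{\pi_{\theta'}(\mathcal{R}^m)} = \lim_{m \to \infty} \frac{\beta_\theta(m) p_\theta w_\theta}{\beta_{\theta'}(m) p_{\theta'} w_{\theta'}} = \lim_{m \to \infty} \frac{\beta_\theta(m) f_\theta(M) w_\theta}{\beta_{\theta'}(m) f_{\theta'}(M) w_{\theta'}} \tag{85}$$

for arbitrary $\theta, \theta' \in \{1, \dots, r\}$. Here, the final equality uses

$$\lim_{m \to \infty} \frac{p_\theta}{p_{\theta'}} = \lim_{m \to \infty} \frac{\int_{-\varepsilon(m)}^{\varepsilon(m)} f_\theta(x)\,dx}{\int_{-\varepsilon(m)}^{\varepsilon(m)} f_{\theta'}(x)\,dx} = \frac{f_\theta(M)}{f_{\theta'}(M)}, \tag{86}$$

which can be deduced from (68).

Our main result then follows from noting that the $m_\theta = \beta_\theta(m) \cdot m$ representatives of
type $\theta$ in assembly $\mathcal{R}^m$ are symmetric to each other and, therefore, must have identical
pivot probabilities in $\mathcal{R}^m$. Hence

$$\lim_{m \to \infty} \frac{\pi_i(\mathcal{R}^m)}{\pi_j(\mathcal{R}^m)} = \lim_{m \to \infty} \frac{\pi_{\tau(i)}(\mathcal{R}^m) / \beta_{\tau(i)}(m)}{\pi_{\tau(j)}(\mathcal{R}^m) / \beta_{\tau(j)}(m)} = \frac{f_i(M) w_i}{f_j(M) w_j}. \tag{87}$$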


A.3 Remarks

Let us end this appendix with remarks on possible further generalizations. First, the
quadratic bound on $f_\theta$'s variation in a neighborhood of $M$ could be relaxed by choosing
different constants in equations (35) and (49): $t(m_\theta) = m_\theta^{-b_1}$ with $b_1 < \tfrac{1}{2}$ is all that is
needed in order to ensure a vanishing error probability in (49); and $\varepsilon(m) = m^{-b_2}$ with
$b_2 < b_1$ in (35) is sufficient for $\varepsilon(m) \gg t(m_\theta)$. Then a local bound $|f_\theta(x) - f_\theta(M)| \le cx^a$ for
$a > \frac{1 - b_2}{b_2}$ is sufficient to establish Lemma 5. Requirement $b_2 < b_1 < \tfrac{1}{2}$ leaves generous
room for $a < 2$, but implies $a > 1$.

36 To see the second equality note that for $y \in (0, \tfrac{1}{2})$ we have $\frac{1}{1 - y} = 1 + y + y^2 + \dots \le 1 + 2y = 1 + O(y)$.
Similarly, $\frac{1}{1 + y} \ge 1 - y = 1 + O(y)$ and so $\frac{1 + O(y)}{1 + O(y)} = 1 + O(y)$.

37 Local continuity of $f_\theta$ is obviously necessary: a modification of $f_\theta(M)$ - with $f_\theta(x)$ unchanged
for $x \ne M$ - would affect $w_i f_i(M)$ but not $\pi_i(\mathcal{R}^m)$. Also the requirement of positive density at the
common median cannot be relaxed. This is seen, e.g., by considering densities $f_i, f_j$ where $f_i(x) = 0$ on a
neighborhood $N_\varepsilon(M)$ while $f_j(M) = 0$ with $f_j(x) > 0$ for $x \in N_\varepsilon(M) \setminus \{M\}$; then $\pi_i(\mathcal{R}^m)/\pi_j(\mathcal{R}^m)$ converges
to 0 rather than $w_i/w_j$.

Second, it is actually sufficient to assume local continuity of all $f_\theta$ at $M$, rather
than any strengthening of this, 37 if one appeals to an unpublished result by Abraham
Neyman. When, as in our setting, all voting weights have the same order of magnitude,
the uniform convergence theorem of Neyman (1982) for the Shapley value can be
generalized to hold for all random order values that are 'sufficiently close' to the Shapley
value. More specifically, consider the expected marginal contribution of a voter $i \in \{1, \dots, k\}$

$$\Phi_i(v) = \sum_{\varrho \in S_k} p(\varrho) \cdot \left[ v(T_i(\varrho) \cup \{i\}) - v(T_i(\varrho)) \right] \tag{88}$$

in a weighted voting game $v = [q; w_1, \dots, w_k]$, where any given permutation $\varrho \in S_k$ on
$N = \{1, \dots, k\}$ has probability $p(\varrho)$, and $T_i(\varrho) \subset N$ denotes the set of $i$'s predecessors in $\varrho$,
i.e., $T_i(\varrho) = \{j : \varrho(j) < \varrho(i)\}$. The random order value $\Phi(v)$ equals the Shapley value $\phi(v)$
if $p(\varrho) = \frac{1}{k!}$. This equiprobability can, for instance, be obtained by letting $p$ be defined
by the order statistics of a vector of random variables $X = (X_1, \dots, X_k)$ with mutually
independent and $[0, 1]$-uniformly distributed $X_1, \dots, X_k$. The latter assumption can be
relaxed somewhat without destroying the asymptotic proportionality of $i$'s weight $w_i$
and $\Phi_i(v)$ which Neyman (1982) has established when $\Phi(v) = \phi(v)$:

Theorem 3 (Neyman, personal communication). Fix $L > 1$. For every $\varepsilon > 0$ there
exist $\delta > 0$ and $K > 0$ such that if $v$ is the weighted voting game $v = [q; w_1, \dots, w_k]$
with $w_1, \dots, w_k > 0$, $\sum_{i=1}^{k} w_i = 1$, $K \cdot \max_i w_i < q < 1 - K \cdot \max_i w_i$, $\max_{i,j} w_i/w_j < L$,
and $\{p(\varrho)\}_{\varrho \in S_k}$ in (88) is defined by the order statistics of independent $[0, 1]$-valued random
variables $X_1, \dots, X_k$ with densities $f_i$ such that $1 - \delta < f_i(x) < 1 + \delta$ for every $x \in [0, 1]$ and
$i \in \{1, \dots, k\}$ then

$$\sum_{i=1}^{k} |w_i - \Phi_i(v)| < \varepsilon. \tag{89}$$

Of course, one can equivalently let $\{p(\varrho)\}_{\varrho \in S_k}$ be defined by the order statistics of
independent $I_m$-valued random variables with densities $\hat{f}_1, \dots, \hat{f}_k$, instead of $[0, 1]$-valued
ones, if the theorem's condition $1 - \delta < f_i(x) < 1 + \delta$ is replaced by the
requirement that $\frac{1 - \delta}{2\varepsilon(m)} < \hat{f}_i(x) < \frac{1 + \delta}{2\varepsilon(m)}$ for all $x \in I_m$.

The values of $\delta$ and $L$ which one obtains for a given $\varepsilon$ in Theorem 3 apply to any
value of $k$. We consider the weighted voting subgames played by the $k = \sum_{\theta \in \{1, \dots, r\}} k_\theta$
representatives with realizations $\lambda_j \in I_m$ for given $\mathbf{k} \in \mathcal{K}$. The relative weight of any
such representative $i$, $\tilde{w}_i = w_i / \sum_{\theta \in \{1, \dots, r\}} k_\theta w_\theta$, approaches zero as $m \to \infty$; and so does
the maximum relative weight. Recalling that the corresponding subgame's relative
quota $\tilde{q} = q(\mathbf{k}) / \sum_{\theta \in \{1, \dots, r\}} k_\theta w_\theta$ is bounded by $\tfrac{1}{3} \le \tilde{q} \le \tfrac{2}{3}$, the condition
$K \cdot \max_i \tilde{w}_i < \tilde{q} < 1 - K \cdot \max_i \tilde{w}_i$ is satisfied when $m$ is sufficiently large. Any null players with $w_i = 0$
can w.l.o.g. be removed from consideration. Then all weights have the same order of
magnitude, i.e., the choice of $L$ such that $\max_{i,j} w_i/w_j < L$ holds for all $\mathbf{k} \in \mathcal{K}$ is trivial.

Moreover, the conditional densities $\hat{f}_\theta$ in our setup satisfy $\frac{1 - \delta}{2\varepsilon(m)} < \hat{f}_\theta(x) < \frac{1 + \delta}{2\varepsilon(m)}$ for
every $\theta \in \{1, \dots, r\}$ and $x \in I_m$ when $m$ is large enough. Specifically, continuity of $f_\theta$
in a neighborhood of $M$ implies that for any given $\varepsilon > 0$ there exists $\Delta(\varepsilon) > 0$ with
$\lim_{\varepsilon \to 0} \Delta(\varepsilon) = 0$ such that

$$(1 - \Delta(\varepsilon)) \cdot f_\theta(M) \le f_\theta(x) \le (1 + \Delta(\varepsilon)) \cdot f_\theta(M) \tag{90}$$

for all $x \in [M - \varepsilon, M + \varepsilon]$ and all $\theta \in \{1, \dots, r\}$ (cf. inequality (31)). Similarly to
inequality (39) we then conclude

$$(1 - \Delta(\varepsilon)) f_\theta(M) \cdot 2\varepsilon \le p_\theta \le (1 + \Delta(\varepsilon)) f_\theta(M) \cdot 2\varepsilon. \tag{91}$$

Combining the last two inequalities with equation (47) yields

$$\frac{1 - \Delta(\varepsilon)}{(1 + \Delta(\varepsilon)) \cdot 2\varepsilon} \le \hat{f}_\theta(x) \le \frac{1 + \Delta(\varepsilon)}{(1 - \Delta(\varepsilon)) \cdot 2\varepsilon}. \tag{92}$$

So considering $\varepsilon = \varepsilon(m)$ and any fixed $\delta$, the conditional densities $\hat{f}_\theta$ satisfy
$\frac{1 - \delta}{2\varepsilon(m)} < \hat{f}_\theta(x) < \frac{1 + \delta}{2\varepsilon(m)}$ for every $\theta \in \{1, \dots, r\}$ and $x \in I_m$ when $m$ is sufficiently large.

Hence, all premises in Neyman's unpublished Theorem 3 are satisfied by the
corresponding weighted voting subgames of agents with ideal points in $I_m$. Theorem 3,
therefore, ensures the approximate weight proportionality of the aggregate random
order value $\Phi$ of the type-$\theta$ representatives. Now if one recalls (78) and notices that
the bracketed sum equals $\Phi(v)$ with $v = [\tilde{q}; w_{j_1}, \dots, w_{j_k}]$ when $j_1, \dots, j_k$ denote the
representatives with ideal points in $I_m$, we can replace Lemmata 5-6 by the following:

Lemma 7. 

n e^m^q = k e w e (1 + (m)) wUh Hm | (m) | = Q (93) 

The proof of Theorem 1 can then be concluded by appealing to (63), hence

$$\lim_{m \to \infty} \frac{\pi_\theta(\mathcal{R}^m \mid \mathcal{K})}{\pi_{\theta'}(\mathcal{R}^m \mid \mathcal{K})} = \lim_{m \to \infty} \frac{\pi_\theta(\mathcal{R}^m)}{\pi_{\theta'}(\mathcal{R}^m)}, \tag{94}$$

and equations (85)-(87). Importantly, the presumption $|f_\theta(x) - f_\theta(M)| \le cx^2$ for
$x \in [M - \varepsilon_1, M + \varepsilon_1]$, which Lemma 5 required, is not needed by Lemma 7. It can hence
be replaced in Theorem 1 by the simpler requirement that each $f_\theta$ is continuous in a
neighborhood of $M$.

Finally, the assumption that only a finite number of different densities and weights
are involved in the chain $\mathcal{R}^1 \subset \mathcal{R}^2 \subset \mathcal{R}^3 \subset \dots$ could be loosened. However, it is
critical that each representative's relative weight vanishes as $m \to \infty$ in order to apply
Neyman's results; the asymptotic relation (14) fails to hold, for instance, for a chain
with $w_1 = \sum_{j > 1} w_j$. And because our result depends on a vanishing relative error, which
is considered neither by Neyman (1982) nor Theorem 3, 38 it is similarly important that
the aggregate relative weight of each type of representatives is bounded away from
zero. For instance, with just one representative having weight $w_1 = 1$ and $m_2 = m - 1$
ones with $w_2 = 2$ (see equation (13)), $\lim_{m \to \infty} \pi_1(\mathcal{R}^m) = \lim_{m \to \infty} \pi_j(\mathcal{R}^m) = 0$ for any $j \ne 1$
but the limit of $\pi_1(\mathcal{R}^m)/\pi_j(\mathcal{R}^m)$ may fail to exist.

38 See, however, Lindner and Machover (2004), where conditions very similar to ours are considered
for the Shapley and Banzhaf values, and the related discussion by Lindner and Owen (2007).



B Proof of Theorem 2 

Theorem 2. Consider an assembly $\mathcal{R}^{m,q}$ with an arbitrary number $m$ of constituencies and the
relative decision quota $q \in [0.5; 1)$. For each $i \in \{1, \dots, m\}$ let $\lambda_i = t \cdot \mu_i + \varepsilon_i$, where $\mu_1, \dots, \mu_m$
and $\varepsilon_1, \dots, \varepsilon_m$ are all mutually independent random variables, $\varepsilon_1, \dots, \varepsilon_m$ have finite means
and variances, and $\mu_1, \dots, \mu_m$ have an identical bounded density. Then

lim — — — - = — -— . (28) 

The result easily follows from the definition of the Shapley value and the fact that
the orderings which are induced by the realizations of the vectors $\lambda = (\lambda_1, \dots, \lambda_m)$
and $\mu = (\mu_1, \dots, \mu_m)$ will coincide with a probability which tends to 1 as $t$ approaches
infinity. To see the latter, ignore any null events in which several ideal points or
constituency shocks coincide and let $\varrho(x)$ denote the permutation of $\{1, \dots, m\}$ such
that $x_i \le x_j$ whenever $\varrho(i) \le \varrho(j)$ for the real-valued vector $x = (x_i)_{i \in \{1, \dots, m\}}$. We then
have:

Lemma 8. For $i \in \{1, \dots, m\}$ and $t > 0$ let $\lambda_i^t = t \cdot \mu_i + \varepsilon_i$, where $\mu_1, \dots, \mu_m$ and $\varepsilon_1, \dots, \varepsilon_m$ are
all mutually independent random variables, $\varepsilon_1, \dots, \varepsilon_m$ have finite means and variances, and
$\mu_1, \dots, \mu_m$ have an identical bounded density. Then

$$\lim_{t \to \infty} \Pr(\varrho(\lambda^t) = \varrho) = \lim_{t \to \infty} \Pr(\varrho(\mu) = \varrho) = \frac{1}{m!} \tag{95}$$

for each permutation $\varrho$ of $\{1, \dots, m\}$.

Proof. Let us denote the finite variance of $\varepsilon_i$ by $\sigma_i^2$ and let $U = (\max_i |E[\varepsilon_i]|)^3$. We
can choose a real number $k$ such that the bounded density function $h$ of $\mu_i$, with
$i \in \{1, \dots, m\}$, satisfies $h(x) \le k$ for all $x \in \mathbb{R}$. For any given realization $\mu_j = x$, the
probability of the independent random variable $\mu_i$ assuming a value inside interval
$(x - 4t^{-2/3}, x + 4t^{-2/3})$ is bounded above by $k \cdot 8t^{-2/3}$. We can infer that the event
$\{|\mu_i - \mu_j| < 4t^{-2/3}\}$, which is identical to the event $\{|t\mu_i - t\mu_j| < 4t^{1/3}\}$, has a probability
of at most $k \cdot 8t^{-2/3}$ for any $i \ne j \in \{1, \dots, m\}$. And we can conclude from Chebyshev's
inequality that $\Pr(|\varepsilon_i - E[\varepsilon_i]| < t^{1/3})$ is at least $1 - \sigma_i^2 \cdot t^{-2/3}$. For $t > U$, we have
$|E[\varepsilon_i]| < t^{1/3}$; and if $|\varepsilon_i - E[\varepsilon_i]| < t^{1/3}$ holds then also

$$2t^{1/3} > |E[\varepsilon_i]| + |\varepsilon_i - E[\varepsilon_i]| \ge |\varepsilon_i| \tag{96}$$

by the triangle inequality. Hence, the probability for (96) to hold when $t > U$ is
$\Pr(|\varepsilon_i| < 2t^{1/3}) \ge 1 - \sigma_i^2 \cdot t^{-2/3}$ for each $i \in \{1, \dots, m\}$.

Now consider the joint event that (i) $|t\mu_i - t\mu_j| \ge 4t^{1/3}$ for all pairs $i \ne j \in \{1, \dots, m\}$
and (ii) that $|\varepsilon_i| < 2t^{1/3}$ for all $i \in \{1, \dots, m\}$. In this event, the ordering of $\lambda_1^t, \dots, \lambda_m^t$ is
determined entirely by the realization of $t\mu_1, \dots, t\mu_m$; in particular, $\varrho(\lambda^t) = \varrho(\mu)$. Using
the mutual independence of the considered random variables this joint event must
have a probability of at least

$$\prod_{s=1}^{\binom{m}{2}} \left(1 - k \cdot 8t^{-2/3}\right) \cdot \prod_{i=1}^{m} \left(1 - \sigma_i^2 t^{-2/3}\right) \ge 1 - \left(8k \binom{m}{2} + \sum_{i=1}^{m} \sigma_i^2\right) t^{-2/3} \tag{97}$$

for $t > U$. The right hand side clearly tends to 1 as $t$ approaches infinity. It hence
remains to acknowledge that any ordering has an equal probability of $1/m!$ because
$\mu_1, \dots, \mu_m$ are i.i.d. □
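
The content of Lemma 8 can be illustrated by simulation. The sketch below assumes, for illustration only, i.i.d. uniform constituency shocks $\mu_i$ and standard normal $\varepsilon_i$, and estimates how often the ordering of $\lambda^t = t\mu + \varepsilon$ coincides with the ordering of $\mu$ for increasing $t$. The function names and parameter values are ours.

```python
import random

def ordering(values):
    """Indices of the vector's components, sorted from smallest to largest."""
    return tuple(sorted(range(len(values)), key=lambda i: values[i]))

def coincidence_rate(t, m=5, trials=2000, seed=0):
    """Share of draws in which lambda^t = t * mu + eps is ordered like mu."""
    rng = random.Random(seed)
    same = 0
    for _ in range(trials):
        mu = [rng.random() for _ in range(m)]          # assumed: uniform on [0, 1]
        eps = [rng.gauss(0.0, 1.0) for _ in range(m)]  # assumed: standard normal
        lam = [t * a + b for a, b in zip(mu, eps)]
        same += ordering(lam) == ordering(mu)
    return same / trials

for t in (1.0, 10.0, 100.0, 1e4, 1e6):
    print(t, coincidence_rate(t))
```

For small $t$ the idiosyncratic terms dominate and the two orderings rarely agree; as $t$ grows the rate approaches 1, in line with the bound in (97).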